Academic Paper

Relative Prediction Probability based Label Correction Method for DNN when Learning with Noisy Labels in Class Imbalanced Dataset
Document Type
Conference
Source
2023 3rd International Conference on Digital Society and Intelligent Systems (DSInS), pp. 363-367, Nov. 2023
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Learning with noisy labels
deep neural networks
label correction
relative prediction probability
Language
English
Abstract
The noisy label problem is a common challenge for Deep Neural Networks (DNNs). Popular Learning with Noisy Labels (LNL) methods frequently use the DNN's prediction probability to detect and mitigate label noise. However, this prediction probability can be biased by class-imbalanced training datasets, which are common in real-world scenarios. Although some existing LNL methods use a sampling strategy to keep the DNN's prediction probability from becoming biased while mitigating label noise, such sampling can disrupt the DNN's learning process and degrade its performance. In this paper, we propose the Relative Prediction Probability based Label Correction (RPPLC) method to mitigate label noise in class-imbalanced datasets without harming the DNN's learning process. Specifically, we subtract each sample's logit output from the average logit output of its labeled category to calculate the sample's relative prediction probability, and then use this probability to detect and correct label noise. Compared with LNL methods that incorporate a sampling strategy, our method does not interfere with the DNN's learning process and provides more accurate prediction probabilities while mitigating label noise in imbalanced datasets; thus, it can better improve DNN performance in practice. We evaluated our method on commonly used class-imbalanced datasets containing noisy labels, and the experimental results demonstrate that our method outperforms both baseline LNL methods and LNL methods that incorporate a sampling strategy.
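The correction step the abstract describes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the softmax over the relative logits, the sign convention of the subtraction, and the confidence-threshold relabeling rule are all assumptions made for demonstration; the paper defines the exact formulation.

```python
import numpy as np

def relative_prediction_probability(logits, labels, num_classes):
    """Relative prediction probability (sketch).

    For each sample, take the difference between its logit vector and
    the mean logit vector of the category it is labeled with, then
    apply a softmax. The sign convention and the softmax step are
    assumptions for illustration.
    """
    logits = np.asarray(logits, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)

    # Mean logit vector of each labeled category.
    class_mean = np.zeros((num_classes, logits.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            class_mean[c] = logits[mask].mean(axis=0)

    # Relative logits: remove the per-category average, so that a
    # majority class's uniformly inflated logits no longer dominate.
    rel = logits - class_mean[labels]

    # Softmax over the relative logits (numerically stabilized).
    rel -= rel.max(axis=1, keepdims=True)
    e = np.exp(rel)
    return e / e.sum(axis=1, keepdims=True)

def correct_labels(logits, labels, num_classes, threshold=0.9):
    """Flip a label when the relative probability strongly favors
    another class (hypothetical threshold rule, not from the paper)."""
    labels = np.asarray(labels, dtype=np.int64)
    prob = relative_prediction_probability(logits, labels, num_classes)
    pred = prob.argmax(axis=1)
    conf = prob.max(axis=1)
    corrected = labels.copy()
    flip = (pred != labels) & (conf >= threshold)
    corrected[flip] = pred[flip]
    return corrected
```

The design intuition is that subtracting the per-category mean logit is what makes the probability "relative": in an imbalanced dataset, majority classes tend to receive systematically higher logits, and removing that per-class offset lets noisy labels in minority classes be detected on a comparable footing without resampling the training data.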