Robustness and Reliability When Training With Noisy Labels
Document Type
Academic Paper
Author
Source
Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 2022, Proceedings of Machine Learning Research, pp. 922-942.
Subject
Language
English
ISSN
2640-3498
Abstract
Labelling of data for supervised learning can be costly and time-consuming, and the risk of incorporating label noise in large data sets is imminent. When training a flexible discriminative model using a strictly proper loss, such noise will inevitably shift the solution towards the conditional distribution over noisy labels. Nevertheless, while deep neural networks have proven capable of fitting random labels, regularisation and the use of robust loss functions empirically mitigate the effects of label noise. However, such observations concern robustness in accuracy, which is insufficient if reliable uncertainty quantification is critical. We demonstrate this by analysing the properties of the conditional distribution over noisy labels for an input-dependent noise model. In addition, we evaluate the set of robust loss functions characterised by noise-insensitive, asymptotic risk minimisers. We find that strictly proper and robust loss functions both offer asymptotic robustness in accuracy, but neither guarantees that the final model is calibrated. Moreover, even with robust loss functions, overfitting is an issue in practice. With these results, we aim to explain the observed robustness of common training practices, such as early stopping, to label noise. In addition, we aim to encourage the development of new noise-robust algorithms that not only preserve accuracy but also ensure reliability.
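As a minimal sketch of the setting the abstract describes (the notation here is assumed for illustration, not taken from the record): with clean conditional distribution p(y | x) and an input-dependent noise transition p(ỹ | y, x), the conditional distribution over noisy labels is

    % illustrative notation; marginalise the clean label y out of the
    % joint of the noise transition and the clean conditional
    \tilde{p}(\tilde{y} \mid x) = \sum_{y} p(\tilde{y} \mid y, x) \, p(y \mid x)

Minimising a strictly proper loss on noisy data then asymptotically recovers \tilde{p}(\tilde{y} \mid x) rather than p(y | x), which is the shift towards the conditional distribution over noisy labels that the abstract refers to.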