Academic Article

Revisiting Confidence Estimation: Towards Reliable Failure Prediction
Document Type
Periodical
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(5):3370-3387, May 2024
Subject
Computing and Processing
Bioengineering
Calibration
Estimation
Reliability
Predictive models
Training
Task analysis
Machine learning
Confidence estimation
uncertainty quantification
failure prediction
misclassification detection
selective classification
out-of-distribution detection
confidence calibration
model reliability
trustworthy
flat minima
Language
English
ISSN
0162-8828
2160-9292
1939-3539
Abstract
Reliable confidence estimation is a challenging yet fundamental requirement in many risk-sensitive applications. However, modern deep neural networks are often overconfident in their incorrect predictions, i.e., misclassified samples from known classes and out-of-distribution (OOD) samples from unknown classes. In recent years, many confidence calibration and OOD detection methods have been developed. In this paper, we find a general, widespread, yet largely neglected phenomenon: most confidence estimation methods are harmful for detecting misclassification errors. We investigate this problem and reveal that popular calibration and OOD detection methods often lead to worse confidence separation between correctly classified and misclassified examples, making it difficult to decide whether to trust a prediction or not. Finally, we propose to enlarge the confidence gap by finding flat minima, which yields state-of-the-art failure prediction performance under various settings, including balanced, long-tailed, and covariate-shift classification scenarios. Our study not only provides a strong baseline for reliable confidence estimation but also acts as a bridge between understanding calibration, OOD detection, and failure prediction.
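The "confidence separation" the abstract refers to is commonly measured as the AUROC of using confidence (e.g., the maximum softmax probability) to discriminate correctly classified from misclassified samples. The sketch below is illustrative only and not code from the paper; `misclassification_auroc` is a hypothetical helper name, and only the standard max-softmax-probability confidence score is assumed.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def misclassification_auroc(logits, labels):
    """Failure-prediction AUROC: probability that a randomly chosen
    correct prediction receives higher confidence than a randomly
    chosen misclassified one (ties count as 0.5)."""
    probs = softmax(logits)
    conf = probs.max(axis=-1)
    correct = probs.argmax(axis=-1) == labels
    pos, neg = conf[correct], conf[~correct]
    diff = pos[:, None] - neg[None, :]  # pairwise confidence gaps
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

# Toy example: two confident correct predictions, one low-confidence error.
logits = np.array([[4.0, 0.0], [0.0, 3.5], [0.6, 0.4]])
labels = np.array([0, 1, 1])  # third sample is misclassified
print(misclassification_auroc(logits, labels))  # 1.0: the error ranks lowest
```

A perfect score of 1.0 means every correct prediction is more confident than every error; the paper's observation is that calibration and OOD detection methods often shrink this gap, lowering the AUROC even when calibration metrics improve.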