Academic Article

PACMAN: PAC-style bounds accounting for the Mismatch between Accuracy and Negative log-loss.
Document Type
Article
Source
Information & Inference: A Journal of the IMA. Mar 2024, Vol. 13 Issue 1, p1-29. 29p.
Subject
*CLASSIFICATION algorithms
*MACHINE learning
*ERROR probability
*MACHINE performance
*MACHINE theory
*GENERALIZATION
Language
English
ISSN
2049-8764
Abstract
The ultimate performance of machine learning algorithms for classification tasks is usually measured as the empirical error probability (or accuracy) on a testing dataset, although these algorithms are optimized by minimizing a typically different, more convenient, loss function on a training set. For classification tasks, this surrogate is often the negative log-loss, which yields the well-known cross-entropy risk and is typically better behaved numerically than the 0-1 loss. Conventional studies of the generalization error do not usually account for this underlying mismatch between the losses used at training and at testing. In this work, we introduce a theoretical analysis, based on a pointwise probably approximately correct (PAC) approach, of the generalization gap that arises from testing on the accuracy metric while training on the negative log-loss, referred to as PACMAN. Building on the fact that the resulting mismatch can be written as a likelihood ratio, concentration inequalities can be used to obtain insights into the generalization gap in terms of PAC bounds, which depend on meaningful information-theoretic quantities. An analysis of the obtained bounds and a comparison with results available in the literature are also provided. [ABSTRACT FROM AUTHOR]
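To make the train/test mismatch concrete, the following is a minimal LaTeX sketch of the two quantities whose gap is at issue; the notation is ours for illustration and is not necessarily the paper's.

% Hedged sketch of the mismatched generalization gap (our notation, an
% assumption, not taken from the paper). A probabilistic classifier
% Q_{\hat{Y}|X} is trained by minimizing the empirical negative log-loss
% (cross-entropy) on n training samples, but evaluated through the 0-1
% loss, i.e., the error probability:
\[
  \mathrm{gap}
  \;=\;
  \underbrace{\Pr\!\bigl[\hat{Y}(X) \neq Y\bigr]}_{\text{test error probability (0-1 loss)}}
  \;-\;
  \underbrace{\frac{1}{n}\sum_{i=1}^{n}\bigl(-\log Q_{\hat{Y}|X}(y_i \mid x_i)\bigr)}_{\text{empirical negative log-loss on the training set}}
\]
% Writing the per-sample discrepancy between the two losses as a
% likelihood ratio allows standard concentration inequalities to yield
% PAC-style (high-probability) bounds on this gap.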