LG | ML | Dec 10, 2021

PACMAN: PAC-style bounds accounting for the Mismatch between Accuracy and Negative log-loss

arXiv:2112.05547v1
Originality: Incremental advance
AI Analysis

This work addresses a fundamental issue in generalization theory for machine learning, though the advance is incremental because it builds on existing PAC frameworks.

The paper tackles the mismatch between training loss (negative log-loss) and testing metric (accuracy) in classification, deriving PAC-style bounds that incorporate this discrepancy and linking them to information-theoretic quantities.

The ultimate performance of machine learning algorithms for classification tasks is usually measured in terms of the empirical error probability (or accuracy) on a testing dataset. These algorithms, however, are optimized by minimizing a typically different, more convenient loss function on a training set. For classification tasks, this loss function is often the negative log-loss, which leads to the well-known cross-entropy risk and is typically better behaved (from a numerical perspective) than the error probability. Conventional studies of the generalization error do not usually take into account the underlying mismatch between the losses used at the training and testing phases. In this work, we introduce an analysis based on a point-wise PAC approach to the generalization gap that accounts for the mismatch between testing with the accuracy metric and training with the negative log-loss. We label this analysis PACMAN. Building on the fact that this mismatch can be written as a likelihood ratio, concentration inequalities can be used to provide insights into the generalization problem in the form of point-wise PAC bounds that depend on meaningful information-theoretic quantities. An analysis of the obtained bounds and a comparison with available results in the literature are also provided.
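The mismatch described in the abstract can be illustrated with a small numerical sketch. The example below is not taken from the paper: the function name `nll_and_error` and the toy predictions are hypothetical, chosen only to show that a model with the lower average negative log-loss can still have the higher error rate. The final comment states an elementary inequality between the two quantities, which is a standard fact and not one of the paper's PAC bounds.

```python
import numpy as np

def nll_and_error(probs, labels):
    """Return (mean negative log-loss in nats, error rate) for class probabilities.

    probs: array of shape (n_samples, n_classes), each row summing to 1.
    labels: integer array of shape (n_samples,) with the true class indices.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Probability assigned to the true class of each sample.
    p_true = probs[np.arange(len(labels)), labels]
    nll = float(np.mean(-np.log(p_true)))
    # A sample counts as an error when the most probable class is not the true one.
    error = float(np.mean(np.argmax(probs, axis=1) != labels))
    return nll, error

# Hypothetical toy data: four samples, all belonging to class 0.
labels = np.array([0, 0, 0, 0])

# Model A: very confident on three samples, narrowly wrong on the fourth.
probs_a = np.array([[0.99, 0.01], [0.99, 0.01], [0.99, 0.01], [0.49, 0.51]])

# Model B: correct on every sample, but only barely.
probs_b = np.array([[0.55, 0.45]] * 4)

nll_a, err_a = nll_and_error(probs_a, labels)
nll_b, err_b = nll_and_error(probs_b, labels)

# Model A wins on the training-style loss, model B wins on the testing metric.
print(f"A: nll={nll_a:.3f}, error={err_a:.2f}")
print(f"B: nll={nll_b:.3f}, error={err_b:.2f}")

# Elementary fact (not the paper's bound): a misclassified sample has
# true-class probability at most 1/2, so error rate <= mean NLL / log(2).
bound_a = nll_a / np.log(2)
```

Ranking the two models by negative log-loss and by accuracy gives opposite answers here, which is the kind of discrepancy that a generalization analysis using the same loss for training and testing does not capture.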

