LG | AI | SP | PR | AP | ML | Jan 16, 2023

Large Deviations for Classification Performance Analysis of Machine Learning Systems

arXiv:2301.07104v1 | 3 citations | h-index: 62
Originality: Synthesis-oriented
AI Analysis

This work provides theoretical insights into classification performance for researchers in machine learning, but it is incremental as it applies existing large deviations theory to a known problem.

The paper tackles the analysis of binary classification error probabilities in machine learning systems, showing that under certain conditions these errors vanish exponentially with the number of observations, and proposes two approximations for error probability curves, validated on the MNIST dataset.

We study the performance of machine learning binary classification techniques in terms of error probabilities. The statistical test is based on the Data-Driven Decision Function (D3F), learned in the training phase, i.e., what is thresholded before the final binary decision is made. Based on large deviations theory, we show that under appropriate conditions the classification error probabilities vanish exponentially, as $\sim \exp\left(-n\,I + o(n) \right)$, where $I$ is the error rate and $n$ is the number of observations available for testing. We also propose two different approximations for the error probability curves, one based on a refined asymptotic formula (often referred to as exact asymptotics), and another one based on the central limit theorem. The theoretical findings are finally tested using the popular MNIST dataset.
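The exponential decay described in the abstract can be illustrated with a toy model. The sketch below is not the paper's method: it assumes, purely for illustration, that each per-sample decision value is a Bernoulli($p$) variable and that an error occurs when the average of $n$ such values reaches a threshold $\gamma > p$. Under that assumption the exponent $I$ is the Kullback-Leibler divergence between Bernoulli($\gamma$) and Bernoulli($p$) (Cramér's theorem). The code compares the exact tail probability with a textbook refined-asymptotic formula for lattice variables and with a plain central-limit approximation. All function names and the values of `p` and `gamma` are hypothetical.

```python
import math


def kl_bernoulli(gamma, p):
    """Rate I for the toy model: KL divergence between Bernoulli(gamma) and Bernoulli(p)."""
    return gamma * math.log(gamma / p) + (1 - gamma) * math.log((1 - gamma) / (1 - p))


def log_exact_tail(n, p, gamma):
    """log P(mean of n Bernoulli(p) draws >= gamma), summed in log space to avoid overflow."""
    k_min = math.ceil(n * gamma - 1e-9)
    logs = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in range(k_min, n + 1)
    ]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))


def log_refined_tail(n, p, gamma):
    """Refined (exact-asymptotics) approximation for lattice variables.

    Assumes n * gamma is an integer, so the threshold sits on the lattice.
    """
    theta = math.log(gamma * (1 - p) / (p * (1 - gamma)))  # tilting parameter
    tilted_var = gamma * (1 - gamma)                       # variance under the tilted law
    prefactor = (1 - math.exp(-theta)) * math.sqrt(2 * math.pi * n * tilted_var)
    return -n * kl_bernoulli(gamma, p) - math.log(prefactor)


def clt_tail(n, p, gamma):
    """Central-limit approximation of the same tail via the Gaussian Q-function."""
    z = math.sqrt(n) * (gamma - p) / math.sqrt(p * (1 - p))
    return 0.5 * math.erfc(z / math.sqrt(2))


if __name__ == "__main__":
    p, gamma = 0.3, 0.5  # illustrative values only
    rate = kl_bernoulli(gamma, p)
    print(f"rate I = {rate:.5f}")
    for n in (50, 200, 800, 3200):
        exact = log_exact_tail(n, p, gamma)
        refined = log_refined_tail(n, p, gamma)
        clt = math.log(clt_tail(n, p, gamma))
        print(f"n={n:5d}  -log(P)/n: exact={-exact / n:.5f}  "
              f"refined={-refined / n:.5f}  clt={-clt / n:.5f}")
```

In this toy setting the normalized exponent $-\tfrac{1}{n}\log P$ computed from the exact tail approaches $I$ from above as $n$ grows, while the central-limit curve settles on a different exponent when the threshold is far from the mean. This is one way to see why the two approximations suit different regimes.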

