LG | IT | ML | Sep 1, 2022

The Geometry and Calculus of Losses

arXiv:2209.00238v2 | 7 citations | h-index: 49
Originality: Highly original
AI Analysis

This work provides a foundational framework for loss function design in statistical machine learning, potentially enabling losses tailored to specific problems.

The paper tackles the problem of designing loss functions for classification and probability estimation by introducing a novel geometric perspective based on convex sets, which leads to a calculus for interpolating between losses and a theory of polar loss functions.

Statistical decision problems lie at the heart of statistical machine learning. The simplest problems are binary and multiclass classification and class probability estimation. Central to their definition is the choice of loss function, which is the means by which the quality of a solution is evaluated. In this paper we systematically develop the theory of loss functions for such problems from a novel perspective whose basic ingredients are convex sets with a particular structure. The loss function is defined as the subgradient of the support function of the convex set. It is consequently automatically proper (calibrated for probability estimation). This perspective provides three novel opportunities. First, it enables the development of a fundamental relationship between losses and (anti)-norms that appears not to have been noticed before. Second, it enables the development of a calculus of losses induced by the calculus of convex sets, which allows interpolation between different losses and is thus a potentially useful design tool for tailoring losses to particular problems. In doing this we build upon, and considerably extend, existing results on $M$-sums of convex sets. Third, the perspective leads to a natural theory of ``polar'' loss functions, which are derived from the polar dual of the convex set defining the loss, and which form a natural universal substitution function for Vovk's aggregating algorithm.
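The claim that a loss defined as a (sub)gradient is automatically proper can be illustrated numerically. The sketch below is not taken from the paper. It assumes the standard fact that the concave support function of the set behind log loss is the 1-homogeneous extension of Shannon entropy (its conditional Bayes risk), and it uses a finite-difference gradient as a stand-in for the subgradient. The names `bayes_risk`, `loss_from_risk`, and `expected_loss` are hypothetical helpers introduced only for this illustration.

```python
import math


def bayes_risk(p):
    """1-homogeneous extension of Shannon entropy (assumed Bayes risk of log loss)."""
    total = sum(p)
    return -sum(pi * math.log(pi / total) for pi in p)


def loss_from_risk(risk, q, eps=1e-6):
    """Central-difference gradient of `risk` at q.

    Entry i is read as the loss incurred when class i occurs and q was predicted.
    """
    grad = []
    for i in range(len(q)):
        up = list(q)
        down = list(q)
        up[i] += eps
        down[i] -= eps
        grad.append((risk(up) - risk(down)) / (2 * eps))
    return grad


def expected_loss(p, q, risk=bayes_risk):
    """Expected loss of predicting q when outcomes are drawn from p."""
    return sum(pi * li for pi, li in zip(p, loss_from_risk(risk, q)))


p = [0.7, 0.2, 0.1]  # true class probabilities (illustrative values)
q = [0.5, 0.3, 0.2]  # a different prediction (illustrative values)

recovered = loss_from_risk(bayes_risk, q)
log_loss = [-math.log(qi) for qi in q]

print("gradient-derived loss:", recovered)
print("log loss             :", log_loss)
print("expected loss at truth:", expected_loss(p, p))
print("expected loss at q    :", expected_loss(p, q))
```

Under these assumptions the gradient reproduces log loss, and predicting the true distribution gives an expected loss no larger than predicting `q`, which is the properness property the abstract refers to.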

