LG, CV, ML | Jul 12, 2020

It Is Likely That Your Loss Should be a Likelihood

arXiv:2007.06059v2 | 8 citations
AI Analysis

This work addresses the problem of inflexible loss functions in machine learning, offering a probabilistic framework for practitioners to improve model robustness and calibration, though it is incremental in extending existing likelihood-based approaches.

The paper tackles the rigidity of common loss functions like mean-squared-error and cross-entropy by proposing to optimize full likelihoods with parameters such as variance and temperature, enabling adaptive tuning of loss scales and regularization strength. It systematically evaluates these likelihood parameters for robust modeling, outlier detection, and recalibration, and proposes tuning $L_2$ and $L_1$ weights by fitting the scale parameters of normal and Laplace priors.
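The correspondence between common losses and fixed-parameter likelihoods can be written out explicitly. The identities below are standard results rather than equations quoted from the paper, and the symbols ($\hat{y}$ for the prediction, $z$ for logits, $\sigma_w$ and $b$ for prior scales, $\lambda$ for a regularization weight) are notation assumed here for illustration.

```latex
% Normal likelihood: fixing \sigma reduces the NLL to a scaled squared error plus a constant.
-\log \mathcal{N}(y \mid \hat{y}, \sigma^2)
  = \frac{(y - \hat{y})^2}{2\sigma^2} + \log \sigma + \tfrac{1}{2}\log 2\pi

% Softmax with temperature: \tau = 1 recovers the usual cross-entropy.
-\log p(y = k \mid z, \tau)
  = -\frac{z_k}{\tau} + \log \sum_j \exp\!\left(\frac{z_j}{\tau}\right)

% Normal prior on a weight w: an L_2 penalty with weight \lambda_2 = 1 / (2\sigma_w^2).
-\log \mathcal{N}(w \mid 0, \sigma_w^2)
  = \frac{w^2}{2\sigma_w^2} + \log \sigma_w + \tfrac{1}{2}\log 2\pi

% Laplace prior on a weight w: an L_1 penalty with weight \lambda_1 = 1 / b.
-\log \mathrm{Laplace}(w \mid 0, b)
  = \frac{|w|}{b} + \log 2b
```

In each case the $\log \sigma$, $\log \sigma_w$, or $\log 2b$ term is what keeps the scale from growing without bound when it is fitted jointly with the model: it penalizes large scales while the first term penalizes small ones.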

Many common loss functions such as mean-squared-error, cross-entropy, and reconstruction loss are unnecessarily rigid. Under a probabilistic interpretation, these common losses correspond to distributions with fixed shapes and scales. We instead argue for optimizing full likelihoods that include parameters like the normal variance and softmax temperature. Joint optimization of these "likelihood parameters" with model parameters can adaptively tune the scales and shapes of losses in addition to the strength of regularization. We explore and systematically evaluate how to parameterize and apply likelihood parameters for robust modeling, outlier-detection, and re-calibration. Additionally, we propose adaptively tuning $L_2$ and $L_1$ weights by fitting the scale parameters of normal and Laplace priors and introduce more flexible element-wise regularizers.
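As a rough illustration of jointly optimizing a likelihood parameter with model parameters, the following minimal sketch fits a one-dimensional linear regression together with the log of the normal standard deviation. It is not the authors' implementation: the function names `gaussian_nll` and `fit`, the synthetic data, and the plain gradient-descent settings are all assumptions made for this example.

```python
import numpy as np


def gaussian_nll(y, y_hat, log_sigma):
    """Mean negative log-likelihood of y under N(y_hat, sigma^2)."""
    sigma2 = np.exp(2.0 * log_sigma)
    return np.mean(
        0.5 * (y - y_hat) ** 2 / sigma2 + log_sigma + 0.5 * np.log(2.0 * np.pi)
    )


def fit(x, y, steps=5000, lr=0.05):
    """Jointly fit slope w, intercept b, and log_sigma by gradient descent.

    Hypothetical helper for illustration; gradients are derived by hand
    from gaussian_nll above.
    """
    w, b, log_sigma = 0.0, 0.0, 0.0
    for _ in range(steps):
        resid = y - (w * x + b)
        sigma2 = np.exp(2.0 * log_sigma)
        # Model-parameter gradients are the MSE gradients rescaled by 1 / sigma^2.
        grad_w = np.mean(-resid * x) / sigma2
        grad_b = np.mean(-resid) / sigma2
        # d/d(log_sigma) of [0.5 * r^2 * exp(-2 * log_sigma) + log_sigma].
        grad_log_sigma = 1.0 - np.mean(resid ** 2) / sigma2
        w -= lr * grad_w
        b -= lr * grad_b
        log_sigma -= lr * grad_log_sigma
    return w, b, log_sigma


# Synthetic data (assumed for this sketch): y = 2x + 1 plus noise with std 0.3.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=2000)
y = 2.0 * x + 1.0 + 0.3 * rng.normal(size=2000)

w, b, log_sigma = fit(x, y)
print("slope:", w, "intercept:", b, "sigma:", np.exp(log_sigma))
print("final NLL:", gaussian_nll(y, w * x + b, log_sigma))
```

At the optimum the gradient with respect to `log_sigma` vanishes exactly when $\sigma^2$ equals the mean squared residual, so the fitted scale tracks the noise level of the data instead of being fixed in advance as it is in plain mean-squared-error.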

