LG, IT | Oct 28, 2022

The Fisher-Rao Loss for Learning under Label Noise

arXiv:2210.16401v2 | 6 citations | h-index: 3
AI Analysis

This work addresses label noise in training datasets, a practical concern for machine learning practitioners. It presents an incremental improvement: a new loss function based on information geometry.

The paper tackles the problem of learning classifiers from datasets with incorrect labels by studying the Fisher-Rao loss function, which is derived from the Fisher-Rao distance on the statistical manifold of discrete distributions. The authors argue that this loss offers a natural trade-off between robustness to label noise and training dynamics, and illustrate it with numerical experiments on synthetic and MNIST datasets.

Choosing a suitable loss function is essential when learning by empirical risk minimisation. In many practical cases, the datasets used for training a classifier may contain incorrect labels, which prompts interest in using loss functions that are inherently robust to label noise. In this paper, we study the Fisher-Rao loss function, which emerges from the Fisher-Rao distance in the statistical manifold of discrete distributions. We derive an upper bound for the performance degradation in the presence of label noise, and analyse the learning speed of this loss. Comparing it with other commonly used losses, we argue that the Fisher-Rao loss provides a natural trade-off between robustness and training dynamics. Numerical experiments with synthetic and MNIST datasets illustrate this performance.
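
As a rough illustration of the loss discussed above, the sketch below uses the standard closed form of the Fisher-Rao distance between two categorical distributions, d(p, q) = 2 arccos(sum_i sqrt(p_i q_i)). Treating the loss as the squared distance from the predicted distribution to the one-hot distribution of the label is an assumption made here for illustration, since the text above does not spell out the exact definition. The function names are hypothetical.

```python
import math


def fisher_rao_distance(p, q):
    """Fisher-Rao distance between two categorical distributions p and q."""
    # Bhattacharyya coefficient; clipped to [0, 1] to absorb rounding error.
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return 2.0 * math.acos(min(1.0, max(0.0, bc)))


def fisher_rao_loss(probs, label, squared=True):
    """Distance from predicted probs to the one-hot distribution at `label`.

    Assumed form (not confirmed by the summary above): against a one-hot
    target the Bhattacharyya coefficient reduces to sqrt(probs[label]).
    """
    p_label = min(1.0, max(0.0, probs[label]))
    d = 2.0 * math.acos(math.sqrt(p_label))
    return d * d if squared else d


def cross_entropy(probs, label, eps=1e-12):
    """Standard cross-entropy for comparison; eps avoids log(0)."""
    return -math.log(max(probs[label], eps))


if __name__ == "__main__":
    # Compare both losses as the probability of the labelled class shrinks.
    for p_label in (0.9, 0.5, 0.1, 1e-6):
        probs = [p_label, 1.0 - p_label]
        print(p_label, fisher_rao_loss(probs, 0), cross_entropy(probs, 0))
```

Under this assumed form the loss is bounded above by π² (reached when the labelled class receives zero probability), whereas cross-entropy grows without bound as that probability approaches zero. That boundedness is one intuition for why such a loss might be less sensitive to mislabelled examples.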

