LG · ML · May 24, 2019

Learning Surrogate Losses

arXiv:1905.10108v1 · 42 citations
Originality: Incremental advance
AI Analysis

This work addresses a fundamental optimization challenge for machine learning practitioners who must optimize complex real-world metrics, though it is an incremental improvement over existing methods.

The paper tackles the minimization of non-differentiable and non-decomposable loss functions, such as misclassification rate and AUC, by learning smooth surrogate losses with neural networks. The reported empirical results on multiple datasets, compared against state-of-the-art baselines, support the efficiency of the approach.
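To make the two properties concrete, here is a minimal sketch in plain Python. It is an illustration only and not taken from the paper: the helper names `f1_score` and `misclassification_rate`, the 0.5 threshold, and the toy labels are all assumptions. It shows that F1 on a batch is not the average of F1 on its parts (non-decomposable), and that the misclassification rate does not react to a small change in a score until the score crosses the threshold (piecewise constant, so it gives no useful gradient).

```python
def f1_score(y_true, y_pred):
    """F1 on hard 0/1 labels; returns 0.0 when TP, FP and FN are all zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def misclassification_rate(y_true, scores, threshold=0.5):
    """Fraction of instances whose thresholded score disagrees with the label."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 1]

# Non-decomposable: F1 of the whole batch differs from the mean F1 of its halves.
whole = f1_score(y_true, y_pred)
halves = (f1_score(y_true[:2], y_pred[:2]) + f1_score(y_true[2:], y_pred[2:])) / 2

# Piecewise constant: a small nudge to one score changes nothing,
# while crossing the threshold makes the loss jump.
base = misclassification_rate(y_true, [0.9, 0.30, 0.6, 0.2])
nudged = misclassification_rate(y_true, [0.9, 0.31, 0.6, 0.2])
crossed = misclassification_rate(y_true, [0.9, 0.51, 0.6, 0.2])

print(whole, halves)          # the two values differ
print(base, nudged, crossed)  # base equals nudged; crossed is lower
```

Because the batch-level value cannot be written as a sum of per-instance terms, a surrogate for such a metric has to look at the whole mini-batch at once.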

The minimization of loss functions is the heart and soul of Machine Learning. In this paper, we propose an off-the-shelf optimization approach that can minimize virtually any non-differentiable and non-decomposable loss function (e.g., Misclassification Rate, AUC, F1, Jaccard Index, Matthews Correlation Coefficient, etc.) seamlessly. Our strategy learns smooth relaxation versions of the true losses by approximating them through a surrogate neural network. The proposed loss networks are set-wise models which are invariant to the order of mini-batch instances. Ultimately, the surrogate losses are learned jointly with the prediction model via bilevel optimization. Empirical results on multiple datasets with diverse real-life loss functions compared with state-of-the-art baselines demonstrate the efficiency of learning surrogate losses.
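The order-invariance property mentioned in the abstract can be sketched as follows. This is a hedged illustration, not the paper's architecture: it assumes a Deep Sets-style design (embed each instance, mean-pool over the mini-batch, apply a linear readout), and the names `embed` and `surrogate_loss`, the four hidden units, and the random untrained weights are all hypothetical. The bilevel training of the surrogate and the prediction model is not shown.

```python
import math
import random


def embed(y_true, y_score, w):
    """Per-instance embedding: one tanh unit per (a, b, c) weight triple."""
    return [math.tanh(a * y_true + b * y_score + c) for (a, b, c) in w]


def surrogate_loss(batch, w, v):
    """Set-wise surrogate: mean-pool the instance embeddings, then a linear readout.

    Pooling over instances with a mean makes the output independent of their order.
    math.fsum is used so that rounding does not depend on summation order either.
    """
    embeddings = [embed(y, s, w) for (y, s) in batch]
    pooled = [math.fsum(column) / len(batch) for column in zip(*embeddings)]
    return math.fsum(vk * pk for vk, pk in zip(v, pooled))


# Random, untrained weights (assumption: 4 hidden units).
rng = random.Random(0)
w = [(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(4)]
v = [rng.uniform(-1, 1) for _ in range(4)]

# A mini-batch of (true label, predicted score) pairs, and two reorderings of it.
batch = [(1, 0.9), (1, 0.3), (0, 0.6), (0, 0.2)]
shuffled = batch[:]
rng.shuffle(shuffled)
reversed_batch = batch[::-1]

loss_a = surrogate_loss(batch, w, v)
loss_b = surrogate_loss(shuffled, w, v)
loss_c = surrogate_loss(reversed_batch, w, v)

print(math.isclose(loss_a, loss_b), math.isclose(loss_a, loss_c))
```

Since every operation here is differentiable in the predicted scores, a surrogate of this shape can pass gradients to the prediction model, which a thresholded metric cannot.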

