LG · ML · Jun 8, 2019

Robust Bi-Tempered Logistic Loss Based on Bregman Divergences

arXiv:1906.03361v3 · 142 citations
AI Analysis

This work addresses robustness to noise in neural network training, but it is incremental as it builds on existing temperature-based methods.

The authors make neural network training more robust to noise by introducing a bi-tempered logistic loss based on Bregman divergences. The loss replaces the softmax output layer and the log loss with temperature-tuned generalizations, and the authors show its efficacy on large data sets.

We introduce a temperature into the exponential function and replace the softmax output layer of neural nets by a high temperature generalization. Similarly, the logarithm in the log loss we use for training is replaced by a low temperature logarithm. By tuning the two temperatures we create loss functions that are non-convex already in the single layer case. When replacing the last layer of the neural nets by our bi-temperature generalization of logistic loss, the training becomes more robust to noise. We visualize the effect of tuning the two temperatures in a simple setting and show the efficacy of our method on large data sets. Our methodology is based on Bregman divergences and is superior to a related two-temperature method using the Tsallis divergence.
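The abstract does not reproduce the formulas, so the following is only a minimal sketch of how the two temperatures could fit together. It assumes the usual tempered definitions log_t(x) = (x^(1-t) - 1) / (1 - t) and exp_t(x) = [1 + (1 - t) x]_+^(1/(1-t)), and a loss of the Bregman form sum_i [ y_i (log_t1 y_i - log_t1 p_i) - (y_i^(2-t1) - p_i^(2-t1)) / (2 - t1) ]. The function names are hypothetical, and the normalization of the tempered softmax is found here by plain bisection, which is a stand-in chosen for clarity and not necessarily the procedure used by the authors.

```python
import math


def log_t(x, t):
    """Tempered logarithm (assumed definition); equals math.log at t == 1."""
    if t == 1.0:
        return math.log(x)
    return (x ** (1.0 - t) - 1.0) / (1.0 - t)


def exp_t(x, t):
    """Tempered exponential (assumed definition); equals math.exp at t == 1."""
    if t == 1.0:
        return math.exp(x)
    base = max(1.0 + (1.0 - t) * x, 0.0)
    if base == 0.0:
        # Zero for t < 1 (finite support), unbounded for t > 1.
        return 0.0 if t < 1.0 else math.inf
    return base ** (1.0 / (1.0 - t))


def tempered_softmax(activations, t, iterations=100):
    """Probabilities exp_t(a_i - lam), with lam found by bisection so they sum to 1.

    Restricted to t >= 1 (the "high temperature" case) in this sketch.
    """
    if t < 1.0:
        raise ValueError("this sketch only handles t >= 1")
    n = len(activations)
    a_max = max(activations)
    # At lam = a_max the largest term alone is exp_t(0) = 1, so the sum is >= 1.
    lo = a_max
    # At this lam every term is <= 1/n, so the sum is <= 1.
    hi = a_max - log_t(1.0 / n, t)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        total = sum(exp_t(a - mid, t) for a in activations)
        if total > 1.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return [exp_t(a - lam, t) for a in activations]


def bi_tempered_loss(activations, labels, t1, t2):
    """Bi-tempered loss for one example, assuming 0 <= t1 <= 1 <= t2."""
    probs = tempered_softmax(activations, t2)
    loss = 0.0
    for y, p in zip(labels, probs):
        if y > 0.0:
            loss += y * (log_t(y, t1) - log_t(p, t1))
        loss -= (y ** (2.0 - t1) - p ** (2.0 - t1)) / (2.0 - t1)
    return loss


if __name__ == "__main__":
    one_hot = [1.0, 0.0, 0.0]
    # With both temperatures at 1 the sketch reduces to softmax cross-entropy.
    print(bi_tempered_loss([2.0, 1.0, 0.0], one_hot, 1.0, 1.0))
    # A badly misclassified example: the ordinary loss grows with the margin,
    # while the tempered loss with t1 < 1 stays below 1 / (1 - t1).
    print(bi_tempered_loss([-50.0, 50.0, 0.0], one_hot, 1.0, 1.0))
    print(bi_tempered_loss([-50.0, 50.0, 0.0], one_hot, 0.8, 1.2))
```

Under these assumptions, setting t1 < 1 caps the loss any single example can contribute, and setting t2 > 1 gives the output distribution a heavier tail than the ordinary softmax. Both are plausible readings of why the abstract reports more robustness to noise, but the sketch itself does not demonstrate that claim.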

Code Implementations: 11 repos

Data from Papers with Code (CC-BY-SA-4.0)
