An Exploration into why Output Regularization Mitigates Label Noise
This paper provides a theoretical justification for a practical method of mitigating label noise, an incremental contribution to the field of robust machine learning.
The paper tackles the problem of label noise in supervised learning by mathematically analyzing output regularization losses, showing that they become symmetric as regularization increases, which explains their noise robustness.
Label noise poses a real challenge for supervised learning algorithms, and mitigating it has attracted considerable research attention in recent years. Noise-robust losses are among the more promising approaches for dealing with label noise, as they only require changing the loss function and leave the design of the classifier itself untouched, which avoids costly development time. In this work we focus on losses that use output regularization (such as label smoothing and entropy). Although these losses perform well in practice, their ability to mitigate label noise lacks a rigorous mathematical explanation. We aim to close this gap by showing that losses that incorporate an output regularization term become symmetric as the regularization coefficient goes to infinity. We argue that the regularization coefficient can be seen as a hyper-parameter controlling the symmetry, and thus the noise robustness, of the loss function.
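To make the two families of losses named above concrete, the following is a minimal sketch of label smoothing and entropy regularization on top of cross-entropy. The function names, the coefficients `alpha` and `beta`, and the exact parametrization are assumptions chosen for illustration; they follow common textbook forms and may differ from the formulation analyzed in the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    # Standard cross-entropy for a single example with an integer label.
    return -math.log(probs[label])

def label_smoothing_loss(probs, label, alpha):
    # Assumed form: mix the one-hot target with a uniform target.
    # alpha = 0 recovers cross-entropy; alpha = 1 ignores the label entirely.
    k = len(probs)
    uniform_term = -sum(math.log(p) for p in probs) / k
    return (1.0 - alpha) * cross_entropy(probs, label) + alpha * uniform_term

def entropy_regularized_loss(probs, label, beta):
    # Assumed form: cross-entropy minus beta times the output entropy,
    # which penalizes over-confident (low-entropy) predictions.
    entropy = -sum(p * math.log(p) for p in probs)
    return cross_entropy(probs, label) - beta * entropy

# Hypothetical three-class prediction used only for illustration.
probs = softmax([2.0, 0.5, -1.0])
```

In both cases the added term depends only on the classifier output, not on the (possibly noisy) label, so increasing the coefficient shrinks the relative influence of the label on the loss. Only the loss function changes; the classifier that produces `probs` is left as it is.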