LG, Oct 22, 2020

On the intrinsic robustness to noise of some leading classifiers and symmetric loss function -- an empirical evaluation

Hugo Le Baher, Vincent Lemaire, Romain Trinquart

arXiv:2010.13570v5

Originality: Synthesis-oriented

AI Analysis

This work addresses label noise issues in industrial applications, but it is incremental as it builds on existing benchmarks and methods.

The paper tackles the degradation of classifier performance caused by noisy labels in applications such as fraud detection. It evaluates the intrinsic robustness of leading classifiers and symmetric loss functions on artificially corrupted datasets and finds that some algorithms naturally handle noise better than others.

In some industrial applications such as fraud detection, the performance of common supervision techniques may be affected by the poor quality of the available labels: in actual operational use cases, these labels may be weak in quantity, quality, or trustworthiness. We propose a benchmark to evaluate the natural robustness of different algorithms taken from various paradigms on artificially corrupted datasets, with a focus on noisy labels. This paper studies the intrinsic robustness of some leading classifiers. The algorithms under scrutiny include SVM, logistic regression, random forests, XGBoost, and Khiops. Furthermore, building on results from recent literature, the study is supplemented with an investigation into the opportunity to enhance some algorithms with symmetric loss functions.
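The two ideas in the abstract, artificially corrupting labels and using a symmetric loss, can be illustrated with a minimal sketch. The abstract does not state the paper's noise model or which symmetric losses it uses, so everything here is an assumption for illustration: labels are binary and flipped uniformly at random, a margin loss is called symmetric when `loss(z) + loss(-z)` is constant, and the sigmoid loss stands in as one example of such a loss. The names `flip_labels`, `sigmoid_loss`, `logistic_loss`, and `is_symmetric` are hypothetical and not taken from the paper.

```python
import math
import random


def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of binary labels (0/1) in which each label is
    flipped independently with probability `noise_rate`.
    Assumption: uniform, class-independent label noise."""
    rng = random.Random(seed)
    return [1 - y if rng.random() < noise_rate else y for y in labels]


def sigmoid_loss(margin):
    """Sigmoid loss on the margin y * f(x); satisfies l(z) + l(-z) = 1."""
    return 1.0 / (1.0 + math.exp(margin))


def logistic_loss(margin):
    """Standard logistic loss on the margin; not symmetric."""
    return math.log1p(math.exp(-margin))


def is_symmetric(loss, margins, tol=1e-9):
    """Check numerically whether loss(z) + loss(-z) is constant
    over the given margins."""
    sums = [loss(m) + loss(-m) for m in margins]
    return max(sums) - min(sums) < tol


# Corrupt a balanced toy label vector at a 20% noise rate.
clean = [0, 1] * 500
noisy = flip_labels(clean, noise_rate=0.2)
observed_rate = sum(c != n for c, n in zip(clean, noisy)) / len(clean)

margins = [-3.0, -1.0, 0.0, 0.5, 2.0]
print("observed flip rate:", observed_rate)
print("sigmoid loss symmetric:", is_symmetric(sigmoid_loss, margins))
print("logistic loss symmetric:", is_symmetric(logistic_loss, margins))
```

In a benchmark of this kind, one would typically train each classifier on the corrupted labels and score it against the clean ones, repeating over several noise rates. That loop is omitted here because the abstract does not specify the protocol.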
