LG · Oct 22, 2020

On the intrinsic robustness to noise of some leading classifiers and symmetric loss function -- an empirical evaluation

arXiv:2010.13570v5
Originality: Synthesis-oriented
AI Analysis

This work addresses label noise in industrial applications, but its contribution is incremental: it builds on existing benchmarks and methods.

The paper tackles the degradation of classifier performance caused by noisy labels in applications such as fraud detection. It evaluates the intrinsic robustness of leading classifiers and symmetric loss functions on artificially corrupted datasets, and finds that some algorithms naturally handle noise better than others.

In some industrial applications such as fraud detection, the performance of common supervision techniques may be affected by the poor quality of the available labels: in actual operational use cases, these labels may be weak in quantity, quality, or trustworthiness. We propose a benchmark to evaluate the natural robustness of different algorithms taken from various paradigms on artificially corrupted datasets, with a focus on noisy labels. This paper studies the intrinsic robustness of some leading classifiers. The algorithms under scrutiny include SVM, logistic regression, random forests, XGBoost, and Khiops. Furthermore, building on results from recent literature, the study is supplemented with an investigation into the opportunity to enhance some algorithms with symmetric loss functions.
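To make the two ideas in the abstract concrete, here is a minimal Python sketch. It is not the paper's benchmark code. The first part corrupts binary labels by flipping each one independently with a fixed probability, which is one common way to simulate label noise; the abstract does not state which noise model the authors use, so uniform flipping is an assumption. The second part checks numerically the usual definition of a symmetric loss on the margin z = y·f(x), namely that ℓ(z) + ℓ(−z) is constant. The names `flip_labels`, `sigmoid_loss`, `hinge_loss`, and `is_symmetric` are hypothetical and chosen only for illustration.

```python
import math
import random


def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of binary labels (0/1) in which each label is flipped
    independently with probability `noise_rate` (assumed uniform noise)."""
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)
    return [1 - y if rng.random() < noise_rate else y for y in labels]


def sigmoid_loss(margin):
    """Sigmoid loss on the margin y * f(x), with y in {-1, +1}."""
    return 1.0 / (1.0 + math.exp(margin))


def hinge_loss(margin):
    """Hinge loss on the margin, as used by standard SVMs."""
    return max(0.0, 1.0 - margin)


def is_symmetric(loss, margins, tol=1e-9):
    """Check numerically that loss(z) + loss(-z) takes the same value
    for every margin z in `margins`."""
    sums = [loss(z) + loss(-z) for z in margins]
    return max(sums) - min(sums) <= tol


# Corrupt a balanced toy label vector with a 20% flip rate.
clean = [0, 1] * 500
noisy = flip_labels(clean, noise_rate=0.2, seed=42)
flipped_fraction = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
print(f"fraction of flipped labels: {flipped_fraction:.3f}")

# Compare a symmetric loss with a non-symmetric one on a few margins.
margins = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]
print("sigmoid loss symmetric:", is_symmetric(sigmoid_loss, margins))
print("hinge loss symmetric:", is_symmetric(hinge_loss, margins))
```

In a benchmark of the kind described, a classifier would be trained on the corrupted labels and scored against the clean ones at several noise rates. The sigmoid loss passes the symmetry check because its two terms always sum to 1, while the hinge loss does not.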
