ML | LG | SP | Oct 3, 2019

Robust Risk Minimization for Statistical Learning

arXiv:1910.01544v2 | 8 citations
AI Analysis

This paper addresses data corruption in machine learning, a common issue in real-world datasets. The contribution appears incremental, since it builds on existing robust learning frameworks.

The paper tackles statistical learning with corrupted training data by developing a robust method that requires only an upper bound on the corruption fraction, and it reports state-of-the-art performance across applications such as regression and classification.

We consider a general statistical learning problem where an unknown fraction of the training data is corrupted. We develop a robust learning method that only requires specifying an upper bound on the corrupted data fraction. The method minimizes a risk function defined by a non-parametric distribution with unknown probability weights. We derive and analyse the optimal weights and show how they provide robustness against corrupted data. Furthermore, we give a computationally efficient coordinate descent algorithm to solve the risk minimization problem. We demonstrate the wide-ranging applicability of the method, including regression, classification, unsupervised learning and classic parameter estimation, with state-of-the-art performance.
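To make the idea of a weighted risk with unknown probability weights more concrete, here is a minimal sketch of an alternating (block coordinate descent) scheme for fitting a line. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how the weights are constrained, so the entropy bound `log((1 - eps) * n)` and the resulting exponential form of the weights are assumptions, and all function names (`gibbs_weights`, `robust_weights`, `robust_line_fit`) are hypothetical.

```python
import math
import random

def gibbs_weights(losses, alpha):
    """Weights proportional to exp(-alpha * loss), normalised to sum to 1."""
    m = min(losses)  # shift for numerical stability
    w = [math.exp(-alpha * (l - m)) for l in losses]
    s = sum(w)
    return [wi / s for wi in w]

def entropy(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

def robust_weights(losses, eps, iters=60):
    """ASSUMPTION: pick alpha >= 0 by bisection so that the entropy of the
    weights stays at or just above log((1 - eps) * n)."""
    target = math.log((1.0 - eps) * len(losses))
    lo, hi = 0.0, 1.0
    while entropy(gibbs_weights(losses, hi)) > target and hi < 1e8:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if entropy(gibbs_weights(losses, mid)) > target:
            lo = mid
        else:
            hi = mid
    return gibbs_weights(losses, lo)

def weighted_line_fit(xs, ys, w):
    """Closed-form minimiser of sum_i w_i * (y_i - a - b * x_i)**2,
    for weights that sum to 1."""
    mx = sum(wi * x for wi, x in zip(w, xs))
    my = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def robust_line_fit(xs, ys, eps, rounds=20):
    """Alternate between the parameter step and the weight step."""
    n = len(xs)
    w = [1.0 / n] * n  # start from the ordinary (uniform) empirical risk
    for _ in range(rounds):
        a, b = weighted_line_fit(xs, ys, w)
        losses = [(y - a - b * x) ** 2 for x, y in zip(xs, ys)]
        w = robust_weights(losses, eps)
    return (a, b), w

# Synthetic example: true line y = 1 + 2x, with 20% of the labels corrupted.
random.seed(0)
n, n_bad = 200, 40
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [1.0 + 2.0 * x + random.gauss(0.0, 0.1) for x in xs]
for i in range(n_bad):
    ys[i] = 10.0 + random.gauss(0.0, 0.1)  # corrupted points

ols = weighted_line_fit(xs, ys, [1.0 / n] * n)        # non-robust baseline
(a_hat, b_hat), w = robust_line_fit(xs, ys, eps=0.3)  # eps is only an upper bound
print("baseline:", ols, "robust:", (a_hat, b_hat))
```

In this sketch the only user-supplied quantity is `eps`, an upper bound on the corrupted fraction (0.3 here, against a true fraction of 0.2). Points with large loss receive exponentially small weight, so the corrupted points have little influence on the weighted fit.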

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
