LG · ML · Dec 5, 2019

Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions

arXiv:1912.02566v3 · 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses computational efficiency for machine learning practitioners, but it appears incremental as it builds on existing screening and regularization methods.

The paper tackles the problem of reducing computational cost in empirical risk minimization by automatically discarding data samples without losing optimization guarantees, achieving dataset compression into a subset of representative points.

We design simple screening tests to automatically discard data samples in empirical risk minimization without losing optimization guarantees. We derive loss functions that produce dual objectives with a sparse solution. We also show how to regularize convex losses to ensure such a dual sparsity-inducing property, and propose a general method to design screening tests for classification or regression based on ellipsoidal approximations of the optimal set. In addition to producing computational gains, our approach also allows us to compress a dataset into a subset of representative points.
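The idea of a screening test over an ellipsoidal region can be illustrated with a minimal sketch. It is not the paper's algorithm. It assumes a linear classifier with a loss that is exactly zero once the margin `y_i * x_i^T w` exceeds a threshold (for example the hinge loss, with threshold 1), and it assumes that an ellipsoid `{w : (w - center)^T shape^{-1} (w - center) <= 1}` known to contain the optimal `w` is already available. How to obtain that ellipsoid is the paper's contribution and is not reproduced here. The names `screen_samples`, `center`, `shape`, and `margin` are hypothetical. Under these assumptions, the smallest margin a sample can attain over the ellipsoid has a closed form, and a sample whose worst-case margin stays above the threshold contributes nothing at the optimum and can be discarded.

```python
import numpy as np


def screen_samples(X, y, center, shape, margin=1.0):
    """Return a boolean mask of samples that can be discarded.

    Assumptions (illustrative, not taken from the paper's code):
      - the optimum w* lies in {w : (w - center)^T shape^{-1} (w - center) <= 1}
      - the loss of sample i is zero whenever y_i * x_i^T w > margin
    """
    # Margin of each sample at the ellipsoid center.
    scores = y * (X @ center)
    # Largest possible decrease of the margin inside the ellipsoid:
    # sqrt(x_i^T shape x_i) for each row x_i of X.
    widths = np.sqrt(np.einsum("ij,jk,ik->i", X, shape, X))
    # Worst-case margin over the ellipsoid must still exceed the threshold.
    return scores - widths > margin


# Toy data: three samples in two dimensions.
X = np.array([[3.0, 0.0], [0.5, 0.0], [0.0, 2.0]])
y = np.array([1.0, 1.0, -1.0])
center = np.array([1.0, 0.0])
shape = 0.25 * np.eye(2)  # a ball of radius 0.5 around the center

discard = screen_samples(X, y, center, shape)
print(discard)
```

In this toy example only the first sample passes the test: its worst-case margin is 3 - 1.5 = 1.5, which exceeds 1, while the other two samples could still be active somewhere in the region and must be kept.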

Code Implementations: 1 repo
