LG · NE · ML · Feb 20, 2020

Uncovering Coresets for Classification With Multi-Objective Evolutionary Algorithms

arXiv:2002.08645v1 · 5 citations
AI Analysis

This work addresses improving training speed and interpretability in machine learning, but it is incremental as it builds on existing coreset methods.

The paper tackles the problem of coreset discovery for classification by proposing a multi-objective evolutionary algorithm to optimize trade-offs between subset size and classification error, resulting in lower error and better generalization than state-of-the-art techniques on non-trivial benchmarks.

A coreset is a subset of the training set on which a machine learning algorithm achieves performance similar to what it would deliver if trained on the whole original data. Coreset discovery is an active and open line of research, as it can improve training speed and may help humans understand the results. Building on previous work, a novel approach is presented: candidate coresets are iteratively optimized by adding and removing samples. As there is an obvious trade-off between limiting training size and the quality of the results, a multi-objective evolutionary algorithm is used to simultaneously minimize the number of points in the set and the classification error. Experimental results on non-trivial benchmarks show that the proposed approach allows a classifier to obtain lower error and better generalization to unseen data than state-of-the-art coreset discovery techniques.
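
The two-objective search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy data, the 1-nearest-neighbour classifier, the simple "keep the non-dominated set" selection, and every function name (`make_data`, `error_1nn`, `objectives`, `dominates`, `mutate`, `pareto_front`, `evolve`) are assumptions made for illustration, and the error is measured on the full training set for simplicity. It only shows the idea of adding and removing samples while minimizing both coreset size and classification error.

```python
import random

def make_data(n_per_class=30, seed=0):
    """Toy 2-D data: two Gaussian blobs labelled 0 and 1 (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(-2.0, 0.0), (2.0, 0.0)]):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data

def error_1nn(subset, data):
    """Error on the full training set of a 1-NN classifier that only knows `subset`."""
    if not subset:
        return 1.0  # an empty coreset cannot classify anything
    wrong = 0
    for (x, y), label in data:
        nearest = min(
            subset,
            key=lambda i: (data[i][0][0] - x) ** 2 + (data[i][0][1] - y) ** 2,
        )
        wrong += data[nearest][1] != label
    return wrong / len(data)

def objectives(subset, data):
    """Both objectives are minimized: (number of points, classification error)."""
    return (len(subset), error_1nn(subset, data))

def dominates(a, b):
    """Pareto dominance: a is no worse than b everywhere and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def mutate(subset, n, rng):
    """Add one sample if it is absent, remove it if it is present."""
    return frozenset(set(subset) ^ {rng.randrange(n)})

def pareto_front(scored):
    """Keep only candidates whose objectives no other candidate dominates."""
    return {
        s: f for s, f in scored.items()
        if not any(dominates(g, f) for g in scored.values())
    }

def evolve(data, generations=30, offspring=20, seed=1):
    """Simplified evolutionary loop (an assumption, not the paper's algorithm)."""
    rng = random.Random(seed)
    n = len(data)
    population = {}
    for _ in range(10):  # random initial candidate coresets of 5 points each
        s = frozenset(rng.sample(range(n), 5))
        population[s] = objectives(s, data)
    for _ in range(generations):
        parents = list(population)
        for _ in range(offspring):
            child = mutate(rng.choice(parents), n, rng)
            if child not in population:
                population[child] = objectives(child, data)
        population = pareto_front(population)
    return population
```

Calling `evolve(make_data())` returns a dictionary mapping each non-dominated candidate coreset (a frozenset of sample indices) to its `(size, error)` pair, so the trade-off between the two objectives can be inspected directly.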

Code Implementations: 1 repo