LGCVOct 13, 2025

The Easy Path to Robustness: Coreset Selection using Sample Hardness

arXiv:2510.11018v1h-index: 20
Originality Incremental advance
AI Analysis

This addresses the problem of efficiently training robust models for practitioners by focusing on data-centric approaches, though it is incremental as it builds on existing coreset selection methods.

The paper tackles the problem of improving adversarial robustness in machine learning models by proposing a coreset selection method based on sample hardness, quantified using average input gradient norm (AIGN). The result is that models trained on the selected data achieve up to 7% and 5% higher adversarial accuracy under standard and adversarial training compared to existing methods.

Designing adversarially robust models from a data-centric perspective requires understanding which input samples are most crucial for learning resilient features. While coreset selection provides a mechanism for efficient training on data subsets, current algorithms are designed for clean accuracy and fall short in preserving robustness. To address this, we propose a framework linking a sample's adversarial vulnerability to its \textit{hardness}, which we quantify using the average input gradient norm (AIGN) over training. We demonstrate that \textit{easy} samples (with low AIGN) are less vulnerable and occupy regions further from the decision boundary. Leveraging this insight, we present EasyCore, a coreset selection algorithm that retains only the samples with low AIGN for training. We empirically show that models trained on EasyCore-selected data achieve significantly higher adversarial accuracy than those trained with competing coreset methods under both standard and adversarial training. As AIGN is a model-agnostic dataset property, EasyCore is an efficient and widely applicable data-centric method for improving adversarial robustness. We show that EasyCore achieves up to 7\% and 5\% improvement in adversarial accuracy under standard training and TRADES adversarial training, respectively, compared to existing coreset methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes