The Easy Path to Robustness: Coreset Selection using Sample Hardness
The work addresses a practical problem, efficiently training robust models, through a data-centric approach, though the contribution is incremental because it builds on existing coreset selection methods.
The paper aims to improve the adversarial robustness of machine learning models by proposing a coreset selection method based on sample hardness, quantified by the average input gradient norm (AIGN). Models trained on the selected data achieve up to 7% and 5% higher adversarial accuracy than existing coreset methods under standard training and TRADES adversarial training, respectively.
Designing adversarially robust models from a data-centric perspective requires understanding which input samples are most crucial for learning resilient features. While coreset selection provides a mechanism for efficient training on data subsets, current algorithms are designed for clean accuracy and fall short of preserving robustness. To address this, we propose a framework linking a sample's adversarial vulnerability to its \textit{hardness}, which we quantify using the average input gradient norm (AIGN) over training. We demonstrate that \textit{easy} samples (with low AIGN) are less vulnerable and occupy regions further from the decision boundary. Leveraging this insight, we present EasyCore, a coreset selection algorithm that retains only the samples with low AIGN for training. We empirically show that models trained on EasyCore-selected data achieve higher adversarial accuracy than those trained with competing coreset methods, with improvements of up to 7\% under standard training and up to 5\% under TRADES adversarial training. As AIGN is a model-agnostic dataset property, EasyCore is an efficient and widely applicable data-centric method for improving adversarial robustness.
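The selection rule described in the abstract (track each sample's input gradient norm over training, average it, and keep the low-AIGN samples) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it replaces the deep network with a single-layer logistic regression so that the input gradient has the closed form (p - y) * w, and it records each sample's gradient norm once per epoch, just before that sample's SGD update. The function names `train_and_track_aign` and `select_easy_coreset`, the learning rate, and the epoch count are hypothetical choices made for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_and_track_aign(xs, ys, epochs=20, lr=0.1):
    """Train a logistic regression with SGD and return per-sample AIGN.

    Assumption: AIGN is taken as the mean, over epochs, of the L2 norm of
    the loss gradient with respect to the input. For logistic regression
    with cross-entropy loss that gradient is (p - y) * w.
    """
    dim = len(xs[0])
    w = [0.0] * dim
    b = 0.0
    norm_sums = [0.0] * len(xs)
    for _ in range(epochs):
        for i, (x, y) in enumerate(zip(xs, ys)):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            err = p - y
            # Norm of the input gradient: |p - y| * ||w||.
            w_norm = math.sqrt(sum(wj * wj for wj in w))
            norm_sums[i] += abs(err) * w_norm
            # Plain SGD step on the weights and bias.
            w = [wj - lr * err * xj for wj, xj in zip(w, x)]
            b -= lr * err
    return [s / epochs for s in norm_sums]


def select_easy_coreset(aign, fraction):
    """Return sorted indices of the `fraction` of samples with lowest AIGN."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    k = min(len(aign), max(1, int(round(fraction * len(aign)))))
    order = sorted(range(len(aign)), key=lambda i: aign[i])
    return sorted(order[:k])


if __name__ == "__main__":
    # Toy 1-D data: two samples sit close to the decision boundary at 0.
    xs = [[3.0], [-3.0], [2.5], [-2.5], [0.2], [-0.2]]
    ys = [1, 0, 1, 0, 1, 0]
    aign = train_and_track_aign(xs, ys)
    print("AIGN per sample:", [round(a, 3) for a in aign])
    print("Easy coreset (50%):", select_easy_coreset(aign, 0.5))
```

In this toy setting the two samples nearest the decision boundary accumulate the largest AIGN and are the first to be excluded, which mirrors the abstract's claim that easy samples lie further from the boundary. With a deep network, the closed-form gradient would have to be replaced by automatic differentiation with respect to the input.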