LG | AI | May 6, 2024

Outlier Gradient Analysis: Efficiently Identifying Detrimental Training Samples for Deep Learning Models

arXiv:2405.03869v7 | 9 citations | ICML
Originality: Incremental advance
AI Analysis

This work addresses a data-centric challenge for deep learning practitioners by providing a more efficient alternative to influence functions, though it is incremental as it builds on existing outlier detection concepts.

The paper tackles the high computational cost of influence functions for identifying detrimental training samples by proposing outlier gradient analysis, a Hessian-free method that reduces computational overhead while effectively detecting mislabeled samples and selecting data for performance improvement in vision and NLP models.

A core data-centric learning challenge is the identification of training samples that are detrimental to model performance. Influence functions serve as a prominent tool for this task and offer a robust framework for assessing training data influence on model predictions. Despite their widespread use, the high computational cost of calculating the inverse of the Hessian matrix poses constraints, particularly when analyzing large deep models. In this paper, we establish a bridge between identifying detrimental training samples via influence functions and outlier gradient detection. This transformation not only presents a straightforward and Hessian-free formulation but also provides insights into the role of the gradient in sample impact. Through systematic empirical evaluations, we first validate the hypothesis of our proposed outlier gradient analysis approach on synthetic datasets. We then demonstrate its effectiveness in detecting mislabeled samples in vision models and selecting data samples for improving the performance of natural language processing transformer models. We also extend its use to influential sample identification for fine-tuning Large Language Models.
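The idea of treating detrimental samples as outliers in gradient space can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a logistic regression model, synthetic two-class data with a few flipped labels, and a simple L2-norm score as the outlier detector; the abstract does not specify the detector, and all function names here are hypothetical. No Hessian or Hessian inverse is computed at any point.

```python
import numpy as np

def per_sample_gradients(X, y, w):
    """Logistic-loss gradient w.r.t. w for each sample, one row per sample."""
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return (p - y)[:, None] * X

def train_logreg(X, y, lr=0.5, steps=500):
    """Plain full-batch gradient descent on the mean logistic loss."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w -= lr * per_sample_gradients(X, y, w).mean(axis=0)
    return w

def outlier_gradient_scores(grads):
    """Assumed detector: score each sample by the L2 norm of its gradient."""
    return np.linalg.norm(grads, axis=1)

def flag_outliers(scores, budget):
    """Return the indices of the `budget` highest-scoring samples."""
    return np.argsort(scores)[-budget:]

def make_data(n=200, n_flipped=10, seed=0):
    """Two Gaussian blobs with a bias column; a few labels are flipped."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0.0, 1.0], n // 2)
    X = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, 2.0, -2.0)
    X = np.hstack([X, np.ones((n, 1))])
    flipped = rng.choice(n, size=n_flipped, replace=False)
    y_noisy = y.copy()
    y_noisy[flipped] = 1.0 - y_noisy[flipped]
    return X, y_noisy, flipped

X, y_noisy, flipped = make_data()
w = train_logreg(X, y_noisy)
grads = per_sample_gradients(X, y_noisy, w)
scores = outlier_gradient_scores(grads)
flagged = flag_outliers(scores, budget=len(flipped))
overlap = np.intersect1d(flagged, flipped)
print(f"{len(overlap)} of {len(flipped)} flipped labels flagged")
```

In this toy setting, mislabeled points tend to receive large gradients because the trained model confidently disagrees with their labels. For deep models the per-sample gradients would come from the network itself, and a different outlier detector could be substituted for the norm score.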

