CV, LG · Aug 26, 2020

Estimating Example Difficulty Using Variance of Gradients

Chirag Agarwal, Daniel D'souza, Sara Hooker

arXiv:2008.11600v426.6140 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for safe model deployment and interpretability by providing an efficient method to surface challenging examples for human auditing, though it is incremental as it builds on existing gradient-based analysis.

The paper proposes Variance of Gradients (VoG) as a metric for ranking data by difficulty. It shows that examples with high VoG scores are harder for the model to learn, that restricting evaluation to the test instances with the lowest VoG improves generalization performance, and that VoG is a useful ranking for out-of-distribution detection.

In machine learning, a question of great interest is understanding what examples are challenging for a model to classify. Identifying atypical examples ensures the safe deployment of models, isolates samples that require further human inspection and provides interpretability into model behavior. In this work, we propose Variance of Gradients (VoG) as a valuable and efficient metric to rank data by difficulty and to surface a tractable subset of the most challenging examples for human-in-the-loop auditing. We show that data points with high VoG scores are far more difficult for the model to learn and over-index on corrupted or memorized examples. Further, restricting the evaluation to the test set instances with the lowest VoG improves the model's generalization performance. Finally, we show that VoG is a valuable and efficient ranking for out-of-distribution detection.
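The abstract does not spell out how the score is computed, so the following is only a minimal sketch of the general idea under stated assumptions: that a gradient vector with respect to the input has already been saved for each example at several training checkpoints, and that an example's score is the variance of those gradients across checkpoints, averaged over input features. The function names `vog_scores` and `rank_by_difficulty` and the toy data are hypothetical, and the paper's actual procedure may differ (for instance, in how scores are normalized).

```python
import numpy as np

def vog_scores(grads):
    """Variance-of-gradients style score per example (illustrative sketch).

    grads: array of shape (K, N, D) with K checkpoints, N examples,
           D input features. How the gradients are obtained is assumed,
           not taken from the abstract.
    """
    grads = np.asarray(grads, dtype=float)
    per_feature_var = grads.var(axis=0)   # variance across checkpoints, shape (N, D)
    return per_feature_var.mean(axis=1)   # average over features, shape (N,)

def rank_by_difficulty(scores):
    """Indices of examples sorted from highest score to lowest."""
    return np.argsort(-scores, kind="stable")

# Toy data: two examples whose gradients never change across checkpoints,
# and two whose gradients fluctuate randomly.
rng = np.random.default_rng(0)
K, D = 5, 8
stable = np.ones((K, 2, D))
noisy = rng.normal(size=(K, 2, D))
toy_grads = np.concatenate([stable, noisy], axis=1)  # shape (K, 4, D)

scores = vog_scores(toy_grads)
ranking = rank_by_difficulty(scores)
print(scores)
print(ranking)
```

In this toy setup the two fluctuating examples rank first, which mirrors the claim that high-scoring examples are the ones to surface for human auditing. Taking the tail of the same ranking would give the lowest-scoring subset mentioned in the evaluation claim.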
