LG | AI | Oct 30, 2025

Faithful and Fast Influence Function via Advanced Sampling

arXiv:2510.26776v2 | 2 citations | h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the computational bottleneck that machine learning practitioners face when estimating influence functions, offering a more efficient and consistent method. It is incremental, however, because it builds on existing sampling approaches.

The paper tackles the problem of efficiently and accurately estimating influence functions for black-box models by proposing advanced sampling techniques based on features and logits, which reduce computation time by 30.1% and memory usage by 42.2% or improve F1-score by 2.5% compared to baselines.

How can we explain the influence of training data on black-box models? Influence functions (IFs) offer a post-hoc solution by utilizing gradients and Hessians. However, computing the Hessian for an entire dataset is resource-intensive, necessitating a feasible alternative. A common approach involves randomly sampling a small subset of the training data, but this method often results in highly inconsistent IF estimates due to the high variance in sample configurations. To address this, we propose two advanced sampling techniques based on features and logits. These samplers select a small yet representative subset of the entire dataset by considering the stochastic distribution of features or logits, thereby enhancing the accuracy of IF estimations. We validate our approach through class removal experiments, a typical application of IFs, using the F1-score to measure how effectively the model forgets the removed class while maintaining inference consistency on the remaining classes. Our method reduces computation time by 30.1% and memory usage by 42.2%, or improves the F1-score by 2.5% compared to the baseline.
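To make the subset idea concrete, here is a minimal sketch of estimating influence scores with a Hessian built from a small, feature-aware subset rather than the full training set. It is an illustration under stated assumptions, not the paper's method: the model is a plain logistic regression, the score uses the standard form -g_test^T H^{-1} g_i, and `stratified_feature_sample` is a hypothetical stand-in that stratifies along the leading principal direction of the features. The abstract does not specify how the feature and logit samplers work, so every function name and parameter below is invented for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def logistic_hessian(X, w, damping=1e-3):
    """Hessian of the mean logistic loss over the rows of X, plus damping * I."""
    p = sigmoid(X @ w)
    weights = p * (1.0 - p)
    H = (X * weights[:, None]).T @ X / X.shape[0]
    return H + damping * np.eye(X.shape[1])


def influence_scores(X_train, y_train, x_test, y_test, w, H):
    """Influence of each training point on one test loss: -g_test^T H^{-1} g_i."""
    g_train = (sigmoid(X_train @ w) - y_train)[:, None] * X_train
    g_test = (sigmoid(x_test @ w) - y_test) * x_test
    return -g_train @ np.linalg.solve(H, g_test)


def stratified_feature_sample(X, m, n_bins=10, seed=0):
    """Hypothetical feature-aware sampler (an assumption, not the paper's).

    Projects the features onto their leading principal direction, splits the
    projection into quantile bins, and draws m // n_bins points from each bin
    so the subset covers the spread of the feature distribution.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    proj = Xc @ vt[0]
    edges = np.quantile(proj, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(proj, edges)  # bin labels in 0 .. n_bins - 1
    chosen = []
    for b in range(n_bins):
        members = np.flatnonzero(bins == b)
        k = min(len(members), m // n_bins)
        chosen.append(rng.choice(members, size=k, replace=False))
    return np.concatenate(chosen)
```

A typical use would be `idx = stratified_feature_sample(X, m=50)`, then `H = logistic_hessian(X[idx], w)`, then `influence_scores(X, y, x_test, y_test, w, H)`. Replacing `idx` with a uniform random draw of the same size gives the random-subset baseline the abstract describes, so the two Hessian estimates can be compared across seeds.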
