LG · ML · Apr 4

A Bayesian Information-Theoretic Approach to Data Attribution

Dharmesh Tailor, Nicolò Felicioni, Kamil Ciosek

arXiv:2604.03858 · 37.9 · h-index: 2

Predicted impact: top 65% in LG · last 90 days · Originality: Incremental advance

AI Analysis

Provides a principled, scalable framework for data attribution in deep learning, bridging information theory with practical methods.

The paper formulates Training Data Attribution as a Bayesian information-theoretic problem, scoring subsets by the entropy increase they induce when removed. The method scales to modern networks via a Gaussian Process surrogate and achieves competitive performance on counterfactual sensitivity, ground-truth retrieval, and coreset selection.

Training Data Attribution (TDA) seeks to trace model predictions back to influential training examples, enhancing interpretability and safety. We formulate TDA as a Bayesian information-theoretic problem: subsets are scored by the information loss they induce, that is, the entropy increase at a query when they are removed. This criterion credits examples for resolving predictive uncertainty rather than label noise. To scale to modern networks, we approximate information loss using a Gaussian Process surrogate built from tangent features. We show this aligns with classical influence scores for single-example attribution while promoting diversity for subsets. For even larger-scale retrieval, we relax to an information-gain objective and add a variance correction for scalable attribution in vector databases. Experiments show competitive performance on counterfactual sensitivity, ground-truth retrieval, and coreset selection, indicating that our method scales to modern architectures while bridging principled measures with practice.
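
The information-loss score described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Gaussian Process with a linear kernel on fixed feature vectors (random vectors stand in for tangent features), a Gaussian likelihood, and scoring on the latent-function variance at the query. The function names and the `noise_var` and `prior_var` parameters are hypothetical. Under these assumptions the entropy at a query is 0.5 * log(2 * pi * e * variance), so the entropy increase from removing a subset reduces to half the log-ratio of posterior variances.

```python
import numpy as np


def posterior_variance(train_feats, query_feat, noise_var=0.1, prior_var=1.0):
    """Latent posterior variance at the query for a linear-kernel GP
    (equivalently, Bayesian linear regression on the features)."""
    d = query_feat.shape[0]
    # Posterior precision over weights: prior precision + data precision.
    precision = np.eye(d) / prior_var + train_feats.T @ train_feats / noise_var
    return float(query_feat @ np.linalg.solve(precision, query_feat))


def information_loss(train_feats, query_feat, removed, **kwargs):
    """Entropy increase at the query when the examples in `removed` are dropped.

    For Gaussians, H = 0.5 * log(2*pi*e*var), so the difference in entropy
    is 0.5 * log(var_without / var_full).
    """
    keep = np.ones(len(train_feats), dtype=bool)
    keep[np.asarray(list(removed), dtype=int)] = False
    var_full = posterior_variance(train_feats, query_feat, **kwargs)
    var_without = posterior_variance(train_feats[keep], query_feat, **kwargs)
    return 0.5 * np.log(var_without / var_full)


# Toy data: random vectors standing in for tangent features (assumption).
rng = np.random.default_rng(0)
feats = rng.normal(size=(20, 5))
query = rng.normal(size=5)

# Single-example attribution: score each training point by its removal.
single_scores = [information_loss(feats, query, [i]) for i in range(len(feats))]
most_influential = int(np.argmax(single_scores))

# Subset attribution: score a pair of examples jointly.
pair_score = information_loss(feats, query, [0, 1])
```

In this Gaussian sketch the score depends only on the features, not on the labels, and removing a larger subset can never lower the score, because dropping data can only increase the posterior variance at the query.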

