LG · ML · Apr 4

A Bayesian Information-Theoretic Approach to Data Attribution

arXiv:2604.03858 · 37.9 · h-index: 2
Predicted impact: top 65% in LG · last 90 days · Originality: Incremental advance
AI Analysis

Provides a principled, scalable framework for data attribution in deep learning, bridging information theory with practical methods.

The paper formulates Training Data Attribution as a Bayesian information-theoretic problem, scoring subsets by the entropy increase they induce when removed. The method scales to modern networks via a Gaussian Process surrogate and achieves competitive performance on counterfactual sensitivity, ground-truth retrieval, and coreset selection.

Training Data Attribution (TDA) seeks to trace model predictions back to influential training examples, enhancing interpretability and safety. We formulate TDA as a Bayesian information-theoretic problem: subsets are scored by the information loss they induce, that is, the entropy increase at a query when they are removed. This criterion credits examples for resolving predictive uncertainty rather than label noise. To scale to modern networks, we approximate information loss using a Gaussian Process surrogate built from tangent features. We show this aligns with classical influence scores for single-example attribution while promoting diversity for subsets. For even larger-scale retrieval, we relax to an information-gain objective and add a variance correction for scalable attribution in vector databases. Experiments show competitive performance on counterfactual sensitivity, ground-truth retrieval, and coreset selection, indicating that our method scales to modern architectures while bridging principled measures with practice.
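
The information-loss criterion described above can be illustrated with a small sketch. This is not the paper's implementation: it assumes a Gaussian Process with a linear kernel over fixed feature vectors (standing in for tangent features, which would normally come from network gradients), a Gaussian observation noise term, and the closed-form entropy of a Gaussian predictive distribution. The function names `posterior_variance`, `gaussian_entropy`, and `information_loss` are hypothetical and chosen only for illustration.

```python
import numpy as np


def posterior_variance(phi_train, phi_query, noise=0.1):
    """Latent GP posterior variance at a query.

    Assumes a linear kernel k(x, x') = phi(x) . phi(x') and
    Gaussian observation noise with variance `noise`.
    """
    k_tt = phi_train @ phi_train.T + noise * np.eye(len(phi_train))
    k_qt = phi_train @ phi_query
    k_qq = phi_query @ phi_query
    return k_qq - k_qt @ np.linalg.solve(k_tt, k_qt)


def gaussian_entropy(var):
    """Differential entropy of a 1-D Gaussian with variance `var`."""
    return 0.5 * np.log(2.0 * np.pi * np.e * var)


def information_loss(phi_train, phi_query, subset, noise=0.1):
    """Entropy increase at the query when `subset` is removed.

    `subset` is a list of training-row indices. The entropy is taken
    over the noisy prediction (latent variance + noise), which is an
    assumption of this sketch. At least one training row must remain.
    """
    keep = np.setdiff1d(np.arange(len(phi_train)), subset)
    var_full = posterior_variance(phi_train, phi_query, noise) + noise
    var_removed = posterior_variance(phi_train[keep], phi_query, noise) + noise
    return gaussian_entropy(var_removed) - gaussian_entropy(var_full)


# Toy features: two near-duplicate examples aligned with the query,
# plus one example orthogonal to it.
phi_train = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
phi_query = np.array([1.0, 0.0])

loss_one_duplicate = information_loss(phi_train, phi_query, [0])
loss_both_duplicates = information_loss(phi_train, phi_query, [0, 1])
loss_orthogonal = information_loss(phi_train, phi_query, [2])
```

In this toy setup, removing a single duplicate costs little because its twin still explains the query, while removing both costs much more. This is one way to read the abstract's remark that the subset score promotes diversity. Removing the orthogonal example leaves the query's entropy unchanged.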

