LG, Nov 25, 2025

Scalable Data Attribution via Forward-Only Test-Time Inference

arXiv:2511.19803v1
Originality: Incremental advance
AI Analysis

The paper addresses the need for scalable debugging, auditing, and data valuation in large models such as LLMs. The advance is rated incremental because it builds on classical influence-function methods.

The paper proposes a data attribution method for modern networks that eliminates per-query backpropagation at inference. On standard MLP benchmarks it matches or surpasses state-of-the-art baselines such as TRAK on the LOO and LDS attribution metrics, at orders-of-magnitude lower inference cost.

Data attribution seeks to trace model behavior back to the training examples that shaped it, enabling debugging, auditing, and data valuation at scale. Classical influence-function methods offer a principled foundation but remain impractical for modern networks because they require expensive backpropagation or Hessian inversion at inference. We propose a data attribution method that preserves the same first-order counterfactual target while eliminating per-query backward passes. Our approach simulates each training example's parameter influence through short-horizon gradient propagation during training and later reads out attributions for any query using only forward evaluations. This design shifts computation from inference to simulation, reflecting real deployment regimes where a model may serve billions of user queries but originate from a fixed, finite set of data sources (for example, a large language model trained on diverse corpora while compensating a specific publisher such as the New York Times). Empirically, on standard MLP benchmarks, our estimator matches or surpasses state-of-the-art baselines such as TRAK on standard attribution metrics (LOO and LDS) while offering orders-of-magnitude lower inference cost. By combining influence-function fidelity with first-order scalability, our method provides a theoretical framework for practical, real-time data attribution in large pretrained models.
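
For readers unfamiliar with the classical baseline the abstract refers to, the following is a minimal sketch of the standard influence-function estimate of a leave-one-out (LOO) effect, checked against exact retraining. It is not the paper's forward-only method. It assumes a small ridge regression problem, where the Hessian can be solved exactly, and all names (`fit_ridge`, `influence_vs_loo`, the synthetic data, the label offset on the test point) are hypothetical choices made for illustration.

```python
import numpy as np

def fit_ridge(X, y, lam, n_total):
    """Minimise (1/n_total) * sum_i 0.5*(x_i.theta - y_i)^2 + 0.5*lam*||theta||^2.

    Returns the minimiser and the Hessian of this objective.
    """
    d = X.shape[1]
    H = X.T @ X / n_total + lam * np.eye(d)
    theta = np.linalg.solve(H, X.T @ y / n_total)
    return theta, H

def influence_vs_loo(seed=0, n=200, d=5, lam=0.01):
    """Compare first-order influence estimates with exact LOO retraining."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=d)
    X = rng.normal(size=(n, d))
    y = X @ theta_true + 0.5 * rng.normal(size=n)
    x_test = rng.normal(size=d)
    # Assumption: offset the test label so the test residual is clearly non-zero.
    y_test = x_test @ theta_true + 1.0

    theta, H = fit_ridge(X, y, lam, n)

    def test_loss(th):
        return 0.5 * (x_test @ th - y_test) ** 2

    # Per-example training gradients (n, d) and the test gradient (d,).
    g_train = (X @ theta - y)[:, None] * X
    g_test = (x_test @ theta - y_test) * x_test

    # Removing example i is an upweighting by -1/n, so the first-order
    # predicted change in test loss is (1/n) * g_test^T H^{-1} g_i.
    predicted = g_train @ np.linalg.solve(H, g_test) / n

    # Ground truth: retrain without example i (same 1/n normalisation).
    actual = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        theta_i, _ = fit_ridge(X[keep], y[keep], lam, n)
        actual[i] = test_loss(theta_i) - test_loss(theta)
    return predicted, actual

if __name__ == "__main__":
    predicted, actual = influence_vs_loo()
    print("Pearson correlation:", np.corrcoef(predicted, actual)[0, 1])
```

The estimate needs a Hessian solve and a gradient of the query loss for every query, which is the per-query cost the abstract describes as impractical for modern networks. In this toy setting the retraining loop is cheap, so the two quantities can be compared directly.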

