Data Attribution in Adaptive Learning
The work addresses data attribution for machine learning researchers and practitioners, particularly in adaptive systems such as reinforcement learning. Its contribution is incremental: it builds on existing attribution methods by extending them to dynamic contexts.
The paper tackles the problem of attributing model predictions to training data in adaptive learning settings, where the learner's own updates shift the distribution of the data it collects over time. It formalizes a conditional interventional target for occurrence-level attribution and identifies conditions under which that target can be recovered from logged data.
Machine learning models increasingly generate their own training data -- online bandits, reinforcement learning, and post-training pipelines for language models are leading examples. In these adaptive settings, a single training observation both updates the learner and shifts the distribution of future data the learner will collect. Standard attribution methods, designed for static datasets, ignore this feedback. We formalize occurrence-level attribution for finite-horizon adaptive learning via a conditional interventional target, prove that replay-side information cannot recover it in general, and identify a structural class in which the target is identified from logged data.
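The feedback described in the abstract can be illustrated with a toy example. The sketch below is not the paper's method or its attribution target. It is a minimal greedy two-armed bandit with deterministic rewards, and every name in it (run_greedy_bandit, reward_of_arm, override) is hypothetical and chosen only for illustration. Replacing the reward at a single round stands in for intervening on one occurrence, and rerunning the learner shows which logged records change as a result.

```python
def run_greedy_bandit(horizon=6, override=None):
    """Run a greedy two-armed bandit and return the logged (arm, reward) pairs.

    override: optional (round_index, reward) pair that replaces the reward
    observed at that round. This is an illustrative stand-in for intervening
    on a single training occurrence, not the paper's formal target.
    """
    reward_of_arm = [0.5, 0.6]  # assumed deterministic rewards, for illustration only
    counts, sums, log = [0, 0], [0.0, 0.0], []
    for t in range(horizon):
        if t < 2:
            arm = t  # pull each arm once before acting greedily
        else:
            estimates = [sums[a] / counts[a] for a in range(2)]
            arm = 0 if estimates[0] >= estimates[1] else 1
        reward = reward_of_arm[arm]
        if override is not None and override[0] == t:
            reward = override[1]  # intervene on this single occurrence
        counts[arm] += 1
        sums[arm] += reward  # the observation updates the learner...
        log.append((arm, reward))  # ...and the learner decides what is logged next
    return log


factual = run_greedy_bandit()
counterfactual = run_greedy_bandit(override=(1, 0.0))

# Rounds whose logged record differs between the two runs.
changed_rounds = [
    t for t, (f, c) in enumerate(zip(factual, counterfactual)) if f != c
]
print(changed_rounds)  # [1, 2, 3, 4, 5]
```

Changing only the round-1 reward alters every later record, because the learner switches arms and never collects the data it logged in the factual run. A method that treats the log as a fixed dataset would register a change in a single record, which is the feedback the abstract says static attribution ignores.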