LG · May 2

Barriers to Counterfactual Credit Attribution for Autoregressive Models

arXiv:2605.01425 · 28.2 · h-index: 2
AI Analysis

For researchers and practitioners building generative AI systems, this work identifies theoretical limitations in achieving fair credit attribution, suggesting that current approaches are either insufficient or computationally infeasible.

The paper studies counterfactual credit attribution (CCA) for autoregressive generative models, showing that CCA does not compose autoregressively and that retrofitting CCA requires exponential query complexity. These results reveal fundamental barriers to implementing credit attribution in such models.

Generative AI disrupts the practice of giving credit to work that came before. Ideally, a generative model would give credit to any work on which its output depends in a significant way. Counterfactual credit attribution (CCA) is a technical condition formalizing this goal. It is a relaxation of differential privacy (DP), recently introduced by Livni, Moran, Nissim, and Pabbaraju [2024], who studied it in the PAC learning setting. We initiate the study of CCA generative models. Specifically, we consider autoregressive models giving credit to a deployment-time dataset (e.g., a RAG database). We uncover barriers to two natural approaches to CCA autoregressive models. First, we show that imposing CCA on the underlying next-token predictor does not guarantee that the model is CCA: unlike DP, CCA does not compose autoregressively. Second, we consider a different approach to building CCA models, which we call retrofitting. Retrofitting takes a model that does not attribute credit and adds credit attribution on top of it. We prove a lower bound for CCA retrofitting under a weak optimality requirement: given black-box access to the starting model, retrofitting requires query complexity exponential in the length of the model's outputs.
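
For readers unfamiliar with the contrast drawn against DP, the following sketches what "composing autoregressively" means for standard pure differential privacy. The notation ($M$, $f$, $D$, $y_t$, $\varepsilon$, $n$) is assumed here for illustration and is not taken from the paper. The CCA definition itself is not reproduced, since the abstract does not state it.

```latex
% Standard pure differential privacy (notation is an assumption, not the paper's).
% A randomized mechanism M is \varepsilon-DP if, for all datasets D, D' that differ
% in one record and every measurable set of outputs S,
\[
  \Pr[M(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[M(D') \in S].
\]

% Autoregressive generation of n tokens from a next-token predictor f that may
% read a deployment-time dataset D:
\[
  y_t \sim f(\,\cdot \mid D,\, y_1, \dots, y_{t-1}), \qquad t = 1, \dots, n.
\]

% If f(. | D, y_{<t}) is \varepsilon-DP in D for every fixed prefix y_{<t}, the
% probability of any full output factorizes token by token, so
\[
  \frac{\Pr[y_{1:n} \mid D]}{\Pr[y_{1:n} \mid D']}
  \;=\; \prod_{t=1}^{n}
        \frac{f(y_t \mid D,\, y_{<t})}{f(y_t \mid D',\, y_{<t})}
  \;\le\; e^{n\varepsilon},
\]
% i.e. the n-token model is (n\varepsilon)-DP (basic adaptive composition).
```

This is the sense in which a per-token DP guarantee lifts to the whole generated sequence, at the cost of a factor of $n$ in the privacy parameter. The abstract's first result says that no analogous lifting holds for CCA.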

