LG · Oct 14, 2025

Influence Dynamics and Stagewise Data Attribution

arXiv:2510.12071v1 · h-index: 2
Originality: Incremental advance
AI Analysis

The paper addresses how influence between training samples evolves as a neural network learns, a question relevant to researchers in machine learning theory and interpretability. The advance appears incremental: it builds on existing attribution methods by adding a temporal dimension.

Current training data attribution methods treat the influence between samples as static, yet neural networks learn in distinct stages with changing influence patterns. The paper predicts that influence can change non-monotonically, with sign flips and peaks at developmental transitions. It validates these predictions analytically and empirically in a toy model, then demonstrates them in language models.

Current training data attribution (TDA) methods treat the influence one sample has on another as static, but neural networks learn in distinct stages that exhibit changing patterns of influence. In this work, we introduce a framework for stagewise data attribution grounded in singular learning theory. We predict that influence can change non-monotonically, including sign flips and sharp peaks at developmental transitions. We first validate these predictions analytically and empirically in a toy model, showing that dynamic shifts in influence directly map to the model's progressive learning of a semantic hierarchy. Finally, we demonstrate these phenomena at scale in language models, where token-level influence changes align with known developmental stages.
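To make the idea of time-varying influence concrete, the sketch below records an influence score between two samples at several training checkpoints. It is only an illustration under stated assumptions: the score is a plain gradient dot product (a first-order proxy, not the singular-learning-theory estimator the paper introduces), and the linear model, the three-point dataset, and the names `grad`, `influence`, and `influence_trajectory` are all hypothetical.

```python
# Hypothetical toy setup: linear model w, squared loss, full-batch gradient descent.
# The gradient dot product is a first-order proxy, NOT the paper's estimator.

def grad(w, x, y):
    """Gradient of 0.5 * (w.x - y)^2 with respect to w."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [residual * xi for xi in x]

def influence(w, sample_a, sample_b):
    """Gradient dot product between two samples at parameters w.

    A positive value means a small gradient step on sample_a also
    lowers the loss on sample_b, to first order.
    """
    ga = grad(w, *sample_a)
    gb = grad(w, *sample_b)
    return sum(p * q for p, q in zip(ga, gb))

def influence_trajectory(data, a, b, steps=50, lr=0.1, every=10):
    """Train from w = 0 and record (step, influence(a, b)) at checkpoints."""
    w = [0.0] * len(data[0][0])
    trajectory = []
    for step in range(steps + 1):
        if step % every == 0:
            trajectory.append((step, influence(w, data[a], data[b])))
        # Average the per-sample gradients, then take one descent step.
        total = [0.0] * len(w)
        for x, y in data:
            for k, g in enumerate(grad(w, x, y)):
                total[k] += g
        w = [wi - lr * gk / len(data) for wi, gk in zip(w, total)]
    return trajectory

# Made-up dataset of (features, target) pairs, for illustration only.
data = [([1.0, 0.0], 1.0), ([1.0, 1.0], 0.5), ([0.0, 1.0], -1.0)]

for step, value in influence_trajectory(data, 0, 1):
    print(step, round(value, 4))
```

A static attribution method would report a single number for the pair. Evaluating the score at each checkpoint instead yields a trajectory, which is the kind of object in which sign flips or peaks could be observed. Whether they appear in this tiny example depends on the data and the learning rate.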

