LG · AI · Jun 3

Trace-Mediated Peak Bias: Bridging Temporal Credit Assignment and Cognitive Heuristics in Deep Reinforcement Learning

arXiv:2606.04735 · 46.6
AI Analysis

For deep reinforcement learning researchers, this work identifies a systematic bias in temporal credit assignment and links it to cognitive heuristics. The findings are incremental, since they extend known issues with eligibility traces.

The paper identifies a systematic failure mode in deep reinforcement learning called Trace-Mediated Peak Bias (TMPB), where agents prefer high-magnitude reward peaks over higher cumulative returns, providing a mechanistic account of the Peak-End Rule. It shows that adaptive optimizers mitigate this pathology via second-moment normalization.

Temporal credit assignment is central to both biological and artificial intelligence, yet its interaction with non-linear function approximation is poorly understood. We identify a systematic failure mode in deep reinforcement learning (RL) termed Trace-Mediated Peak Bias (TMPB). At intermediate eligibility trace depths, agents irrationally prefer trajectories with high-magnitude reward "peaks" over alternatives with higher cumulative returns. This provides a mechanistic account of the Peak-End Rule: a human memory bias where experiences are judged by their most intense moments rather than integrated utility. We show that TMPB emerges because traces amplify distal Temporal Difference errors into "gradient shocks" that fixed-step-size Stochastic Gradient Descent cannot normalize, leading to global overestimation. Conversely, adaptive optimizers mitigate this pathology via second-moment normalization. Our results suggest that human-like saliency distortions may emerge naturally from the mathematical constraints of credit assignment in distributed systems, and that adaptive optimization is a theoretical necessity for rational value estimation.
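
The mechanism described in the abstract can be illustrated with a toy scalar sketch. This is not the paper's experimental setup: the single shared weight, the accumulating trace, the hand-written TD error sequence, and every hyperparameter below (`gamma`, `lam`, `lr`, `beta1`, `beta2`) are assumptions chosen only for illustration. The sketch shows an eligibility trace scaling up one large TD error, and then compares the resulting parameter step under fixed-step-size SGD with the step under an Adam-style update that divides by a running second-moment estimate.

```python
import math

def trace_scaled_gradients(td_errors, gamma=0.99, lam=0.8):
    """Multiply each TD error by an accumulating eligibility trace.

    Toy assumption: one shared weight whose value gradient is always 1,
    so the trace follows e_t = gamma * lam * e_{t-1} + 1.
    """
    trace = 0.0
    grads = []
    for delta in td_errors:
        trace = gamma * lam * trace + 1.0
        grads.append(delta * trace)
    return grads

def sgd_steps(grads, lr=0.1):
    """Parameter change per step under fixed-step-size SGD."""
    return [lr * g for g in grads]

def adam_steps(grads, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Parameter change per step under an Adam-style update."""
    m = 0.0  # running first-moment estimate
    v = 0.0  # running second-moment estimate
    steps = []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        steps.append(lr * m_hat / (math.sqrt(v_hat) + eps))
    return steps

# Hypothetical TD errors: small everywhere except one reward "peak".
td_errors = [0.1] * 20 + [10.0] + [0.1] * 20
grads = trace_scaled_gradients(td_errors)

max_raw = max(abs(d) for d in td_errors)
max_grad = max(abs(g) for g in grads)
max_sgd = max(abs(s) for s in sgd_steps(grads))
max_adam = max(abs(s) for s in adam_steps(grads))

print(f"largest raw TD error:       {max_raw:.3f}")
print(f"largest trace-scaled error: {max_grad:.3f}")
print(f"largest SGD step:           {max_sgd:.3f}")
print(f"largest Adam-style step:    {max_adam:.3f}")
```

In this sketch the SGD step grows in proportion to the trace-scaled peak, while the Adam-style step stays near the learning rate because the peak also inflates the second-moment estimate it is divided by. The sketch only illustrates the normalization argument; it does not reproduce the preference for peak trajectories or the overestimation reported in the abstract.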

