Measuring Learning Progress via Gradient-Momentum Coupling
For reinforcement learning agents, Gradient-Momentum Coupling (GMC) provides a curiosity signal that is more robust to noise than prediction error, addressing the noise sensitivity of curiosity-driven exploration.
Gradient-Momentum Coupling (GMC) is proposed as a learning progress signal for curiosity-driven exploration that distinguishes learnable patterns from noise by measuring how well each sample's gradient aligns with the optimizer's momentum. Controlled experiments show noise robustness and emergent curriculum learning, and experiments on MiniGrid suggest improved robustness to observation noise.
Measuring learning progress is essential for curiosity-driven exploration in reinforcement learning, but widely used signals such as prediction error often fail to distinguish meaningful, learnable patterns from random noise. This paper proposes Gradient-Momentum Coupling (GMC), a signal derived from optimization dynamics that quantifies how useful each sample's gradient is for ongoing learning by measuring its per-parameter normalized absolute product with the momentum from previous gradients. By leveraging momentum's natural filtering of noise and oscillations, GMC identifies samples that contribute to ongoing parameter updates. Controlled experiments demonstrate noise robustness and emergent curriculum learning, with the signal prioritizing tasks by learning speed rather than difficulty. Experiments on MiniGrid suggest that replacing prediction error with GMC within existing curiosity-driven architectures can improve robustness to observation noise.
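To make the idea concrete, the following is a minimal sketch of a GMC-style score, not the paper's implementation. The abstract does not state the exact normalization, so this sketch assumes an Adam-style running second moment as the per-parameter normalizer. The names `gmc_score` and `run_stream`, the decay rates, and the two synthetic gradient streams are all hypothetical. The score is computed against the momentum accumulated from previous gradients only, before the current gradient is folded in.

```python
import numpy as np

DIM = 50  # number of parameters in this toy setting (assumption)


def gmc_score(grad, momentum, second_moment, eps=1e-8):
    """Mean over parameters of |g_i * m_i| / (v_i + eps).

    Assumption: v_i, a running average of squared gradients, is the
    per-parameter normalizer. The abstract does not specify this choice.
    """
    return float(np.mean(np.abs(grad * momentum) / (second_moment + eps)))


def run_stream(sample_grad, steps=200, beta1=0.9, beta2=0.99):
    """Feed a stream of gradients through momentum and return the
    average score over the last 50 steps."""
    m = np.zeros(DIM)  # momentum: running average of past gradients
    v = np.zeros(DIM)  # running average of past squared gradients
    scores = []
    for t in range(steps):
        g = sample_grad()
        if t > 0:
            # Score the new gradient against momentum from PREVIOUS steps.
            scores.append(gmc_score(g, m, v))
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g ** 2
    return float(np.mean(scores[-50:]))


rng = np.random.default_rng(0)

# "Learnable" stream: gradients share a consistent direction plus small noise.
learnable_score = run_stream(lambda: np.ones(DIM) + 0.1 * rng.normal(size=DIM))

# "Noise" stream: zero-mean gradients, so momentum averages toward zero.
noise_score = run_stream(lambda: rng.normal(size=DIM))

print(f"learnable: {learnable_score:.3f}  noise: {noise_score:.3f}")
```

In this toy setting the consistent stream builds up momentum and scores well above the zero-mean stream, even though the noise stream has much larger raw gradient magnitudes (the analogue of high prediction error). This mirrors the filtering effect the abstract attributes to momentum.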