LG · AI · Feb 22

Stable Deep Reinforcement Learning via Isotropic Gaussian Representations

Ali Saheb, Johan Obando-Ceron, Aaron Courville, Pouya Bashivan, Pablo Samuel Castro

MILA

arXiv:2602.19373v12.71 citations · h-index: 78

Originality: Incremental advance

AI Analysis

The paper addresses training instability in deep reinforcement learning, a concern for practitioners, though it appears incremental because it builds on existing representation learning approaches.

The paper tackles unstable training dynamics in deep reinforcement learning caused by non-stationarity. It shows that isotropic Gaussian representations provably enable stable tracking of time-varying targets for linear readouts, and it proposes a regularization method that improves performance while reducing representation collapse and training instability across a variety of domains.

Deep reinforcement learning systems often suffer from unstable training dynamics due to non-stationarity, where learning objectives and data distributions evolve over time. We show that under non-stationary targets, isotropic Gaussian embeddings are provably advantageous. In particular, they induce stable tracking of time-varying targets for linear readouts, achieve maximal entropy under a fixed variance budget, and encourage a balanced use of all representational dimensions--all of which enable agents to be more adaptive and stable. Building on this insight, we propose the use of Sketched Isotropic Gaussian Regularization for shaping representations toward an isotropic Gaussian distribution during training. We demonstrate empirically, over a variety of domains, that this simple and computationally inexpensive method improves performance under non-stationarity while reducing representation collapse, neuron dormancy, and training instability.
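
To make the idea of shaping representations toward an isotropic Gaussian concrete, here is a minimal sketch, not the authors' implementation. The abstract does not say which statistic the regularizer uses, so this example assumes a simple moment-matching penalty: embeddings are projected onto random unit directions (the "sketches"), and each one-dimensional projection is compared with a standard normal through its mean, variance, skewness, and kurtosis. The function name `sketched_isotropy_penalty` and its parameters `num_directions` and `seed` are hypothetical.

```python
import numpy as np


def sketched_isotropy_penalty(z, num_directions=64, seed=0):
    """Hypothetical penalty that is small when the rows of z resemble N(0, I).

    Assumption: moment matching on random 1D projections stands in for
    whatever statistic the paper actually uses.
    """
    rng = np.random.default_rng(seed)
    _, dim = z.shape

    # Random unit directions in R^dim, one per column.
    directions = rng.standard_normal((dim, num_directions))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)

    # 1D projections of every embedding: shape (batch, num_directions).
    proj = z @ directions

    mean = proj.mean(axis=0)
    var = proj.var(axis=0)
    centered = proj - mean
    std = np.sqrt(var + 1e-8)
    skew = (centered ** 3).mean(axis=0) / std ** 3
    kurt = (centered ** 4).mean(axis=0) / std ** 4

    # A standard normal has mean 0, variance 1, skewness 0, kurtosis 3.
    per_direction = mean ** 2 + (var - 1.0) ** 2 + skew ** 2 + (kurt - 3.0) ** 2
    return float(per_direction.mean())


# Usage: compare isotropic embeddings with ones collapsed onto 2 dimensions.
rng = np.random.default_rng(1)
isotropic = rng.standard_normal((4096, 32))
collapsed = isotropic * np.concatenate([np.ones(2), np.full(30, 1e-3)])
print(sketched_isotropy_penalty(isotropic) < sketched_isotropy_penalty(collapsed))
```

In a training loop, a differentiable version of such a penalty would presumably be added to the agent's loss with a small weight. The collapsed batch scores worse because most random directions see far less than unit variance, which is the kind of unbalanced use of dimensions the abstract says the method discourages.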
