Neuron-level Balance between Stability and Plasticity in Deep Reinforcement Learning
This work addresses the challenge of continuous learning in DRL agents by offering a fine-grained way to balance skill retention and adaptation, though the contribution is incremental because it builds on existing network-level methods.
The paper tackles the stability-plasticity dilemma in deep reinforcement learning by proposing a neuron-level method (NBSP) that identifies skill neurons and applies gradient masking and experience replay to preserve existing skills while adapting to new tasks, showing significant performance improvements on Meta-World and Atari benchmarks.
In contrast to the human ability to continuously acquire knowledge, agents struggle with the stability-plasticity dilemma in deep reinforcement learning (DRL), which refers to the trade-off between retaining existing skills (stability) and learning new knowledge (plasticity). Current methods balance these two aspects at the network level and therefore lack differentiation between, and fine-grained control over, individual neurons. To overcome this limitation, we propose the Neuron-level Balance between Stability and Plasticity (NBSP) method, inspired by the observation that specific neurons are strongly associated with task-relevant skills. Specifically, NBSP (1) defines and identifies RL skill neurons that are crucial for knowledge retention through a goal-oriented method, and then (2) introduces a framework that applies gradient masking and experience replay to these neurons, preserving the encoded existing skills while enabling adaptation to new tasks. Extensive experiments on the Meta-World and Atari benchmarks demonstrate that NBSP significantly outperforms existing approaches in balancing stability and plasticity.
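To make the gradient-masking idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes each neuron corresponds to one row of a weight matrix and that a per-neuron skill score is already available. The names `top_k_skill_neurons`, `mask_gradients`, and `sgd_step`, the top-k selection rule, and the `scale` parameter are all hypothetical choices made for illustration; the scoring itself (the goal-oriented identification step) and experience replay are not modeled here.

```python
def top_k_skill_neurons(scores, k):
    """Return the indices of the k highest-scoring neurons.

    Assumption: a higher score means the neuron matters more for an
    already-learned skill. How scores are computed is not shown here.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def mask_gradients(grads, skill_neurons, scale=0.0):
    """Scale the gradient rows of skill neurons; copy all other rows unchanged.

    scale=0.0 freezes skill neurons completely; a value between 0 and 1
    would only damp their updates (an illustrative option, not from the paper).
    """
    return [
        [g * scale for g in row] if i in skill_neurons else list(row)
        for i, row in enumerate(grads)
    ]


def sgd_step(weights, grads, lr):
    """Plain gradient-descent update applied row by row."""
    return [
        [w - lr * g for w, g in zip(w_row, g_row)]
        for w_row, g_row in zip(weights, grads)
    ]


# Toy layer with four neurons, two incoming weights each.
scores = [0.9, 0.1, 0.7, 0.2]            # hypothetical skill scores
weights = [[1.0, 1.0] for _ in range(4)]
grads = [[0.5, 0.5] for _ in range(4)]   # gradients from a new task

skill = top_k_skill_neurons(scores, k=2)
masked = mask_gradients(grads, skill)
new_weights = sgd_step(weights, masked, lr=0.1)
```

In this toy example, the two highest-scoring neurons keep their weights while the remaining neurons are updated by the new-task gradient, which is the stability-plasticity split the abstract describes at the level of individual neurons.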