Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning
This work addresses a critical issue for continual reinforcement learning practitioners: performance degradation when tasks change. The contribution is incremental, since it builds on existing strategies.
The paper tackles the negative transfer problem in continual reinforcement learning, where learning new tasks can degrade performance. It introduces Reset & Distill (R&D), a method that resets the agent's networks and distills knowledge, and reports higher success rates than recent approaches on Meta World tasks.
We argue that the negative transfer problem, which occurs when a new task to learn arrives, is an important problem that should not be overlooked when developing effective Continual Reinforcement Learning (CRL) algorithms. Through comprehensive experimental validation, we demonstrate that this issue arises frequently in CRL and cannot be effectively addressed by several recent works on either mitigating plasticity loss of RL agents or enhancing positive transfer in the CRL scenario. To that end, we develop Reset & Distill (R&D), a simple yet highly effective baseline method, to overcome the negative transfer problem in CRL. R&D combines two components: a strategy of resetting the agent's online actor and critic networks to learn a new task, and an offline learning step that distills knowledge from the online actor and the previous expert's action probabilities. We carry out extensive experiments on long sequences of Meta World tasks and show that our simple baseline method consistently outperforms recent approaches, achieving significantly higher success rates across a range of tasks. Our findings highlight the importance of considering negative transfer in CRL and emphasize the need for robust strategies like R&D to mitigate its detrimental effects.
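The two components described in the abstract (resetting the online actor, then distilling into an offline actor) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear softmax policy over discrete actions for brevity, and every name here (`init_policy`, `action_probs`, `distill_loss`) is hypothetical. Online RL training and the critic are omitted.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q, eps=1e-8):
    """KL(p || q) between two categorical distributions."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))


def init_policy(obs_dim, n_actions, rng):
    """Fresh random linear policy (assumed form).

    Calling this again at the start of a task plays the role of the
    'reset' step: the online actor starts from scratch.
    """
    return [[rng.gauss(0.0, 0.1) for _ in range(n_actions)]
            for _ in range(obs_dim)]


def action_probs(weights, obs):
    """Action probabilities of the linear softmax policy for one observation."""
    n_actions = len(weights[0])
    logits = [sum(obs[i] * weights[i][a] for i in range(len(obs)))
              for a in range(n_actions)]
    return softmax(logits)


def distill_loss(offline_weights, current_batch, replay_batch):
    """Average KL from teacher action probabilities to the offline actor.

    current_batch: (obs, probs) pairs labelled by the online actor
                   trained on the current task.
    replay_batch:  (obs, probs) pairs stored from the expert of
                   earlier tasks.
    """
    pairs = list(current_batch) + list(replay_batch)
    total = sum(kl(teacher, action_probs(offline_weights, obs))
                for obs, teacher in pairs)
    return total / len(pairs)
```

In this sketch the offline actor is the only network that persists across tasks. The online actor is re-initialized with `init_policy` for each task, so nothing learned on earlier tasks can interfere with learning the new one. Minimizing `distill_loss` over both batches is what would let the offline actor absorb the new task while staying close to the stored action probabilities for the old ones.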