Improving generalization to new environments and removing catastrophic forgetting in Reinforcement Learning by using an eco-system of agents
The work addresses generalization and catastrophic forgetting in RL, both of which are critical for deploying agents in varied real-world settings, though the contribution appears incremental because it builds on existing multi-agent concepts.
The paper tackles two problems: RL agents overfit to their training environments, and they suffer catastrophic forgetting when retrained on new ones. It proposes an eco-system of agents intended to improve generalization and prevent forgetting.
Adapting a Reinforcement Learning (RL) agent to an unseen environment is a difficult task because agents typically overfit to the training environment. RL agents are often capable of solving environments very close to the training environment, but when environments become substantially different, their performance quickly drops. When agents are retrained on new environments, a second issue arises: there is a risk of catastrophic forgetting, where the performance on previously seen environments is seriously hampered. This paper proposes a novel approach that exploits an eco-system of agents to address both concerns. In this approach, the (limited) adaptive power of individual agents is harvested to build a highly adaptive eco-system.
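The abstract does not specify how the eco-system selects or adds agents, so the following is only a toy sketch under assumed rules, not the paper's algorithm. It assumes that an existing agent is reused when it scores well enough on a new environment, and that otherwise a new agent is trained and added while existing agents are left untouched. All names (ToyAgent, EcoSystem, reach, threshold) are hypothetical, and an "environment" is reduced to a single number purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyAgent:
    """Hypothetical stand-in for a trained RL agent."""
    trained_on: float   # the (toy) environment this agent was trained on
    reach: float = 1.0  # assumed: limited adaptive power of a single agent

    def score(self, env: float) -> float:
        # 1.0 on the training environment, falling linearly to 0.0
        # once the environment is `reach` or more away.
        return max(0.0, 1.0 - abs(env - self.trained_on) / self.reach)


@dataclass
class EcoSystem:
    """Hypothetical pool of agents; existing agents are never retrained."""
    threshold: float = 0.5  # assumed minimum score to count as "solved"
    agents: List[ToyAgent] = field(default_factory=list)

    def best_agent(self, env: float) -> Optional[ToyAgent]:
        # Pick the highest-scoring agent, if any clears the threshold.
        best = max(self.agents, key=lambda a: a.score(env), default=None)
        if best is not None and best.score(env) >= self.threshold:
            return best
        return None

    def act_on(self, env: float) -> ToyAgent:
        # Reuse an adequate agent; otherwise "train" a new one and add it.
        agent = self.best_agent(env)
        if agent is None:
            agent = ToyAgent(trained_on=env)
            self.agents.append(agent)
        return agent


eco = EcoSystem()
for env in [0.0, 0.3, 5.0, 5.4, 0.1]:
    eco.act_on(env)
print(len(eco.agents))
```

In this sketch, nearby environments are covered by an existing agent, a substantially different environment triggers a new agent, and the earlier agent still handles its original environment afterwards because it was never modified. That is the sense in which the pool as a whole could be more adaptive than any single member.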