Modular Continual Learning in a Unified Visual Environment
This work addresses continual learning for AI systems, offering incremental gains in the efficiency of learning new tasks and switching between them in reinforcement learning.
The paper addresses how agents can efficiently learn and switch between multiple tasks in a unified visual environment. It shows that a modular architecture built on specific design principles outperforms standard neural networks, requiring fewer training examples and fewer neurons to reach high performance.
A core aspect of human intelligence is the ability to learn new tasks quickly and switch between them flexibly. Here, we describe a modular continual reinforcement learning paradigm inspired by these abilities. We first introduce a visual interaction environment that allows many types of tasks to be unified in a single framework. We then describe a reward map prediction scheme that learns new tasks robustly in the very large state and action spaces required by such an environment. We investigate how properties of module architecture influence the efficiency of task learning, showing that a module motif incorporating specific design principles (e.g., early bottlenecks, low-order polynomial nonlinearities, and symmetry) significantly outperforms more standard neural network motifs, needing fewer training examples and fewer neurons to achieve high levels of performance. Finally, we present a meta-controller architecture for task switching based on a dynamic neural voting scheme, which allows new modules to use information learned from previously seen tasks to substantially improve their own learning efficiency.
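As a rough illustration of the reward map prediction idea, the sketch below scores every candidate screen location with a predicted reward and then acts at the highest-scoring one. It is a toy under stated assumptions, not the architecture described in the paper: the function names (`square_concat`, `predict_reward`, `build_reward_map`, `choose_action`), the hand-set weights, and the single linear readout are all hypothetical. The squaring step stands in for a low-order polynomial nonlinearity.

```python
def square_concat(values):
    """Toy low-order polynomial nonlinearity: keep each input and append its square."""
    return list(values) + [v * v for v in values]


def predict_reward(weights, state, action):
    """Hypothetical module: linear readout over polynomial features of (state, action)."""
    features = square_concat(list(state) + list(action))
    return sum(w * f for w, f in zip(weights, features))


def build_reward_map(weights, state, height, width):
    """Predict a reward for every screen location (assumes height >= 2 and width >= 2).

    Actions are (row, col) positions normalized to the range [0, 1].
    """
    return [
        [
            predict_reward(weights, state, (r / (height - 1), c / (width - 1)))
            for c in range(width)
        ]
        for r in range(height)
    ]


def choose_action(reward_map):
    """Return the (row, col) index with the highest predicted reward (first one on ties)."""
    best, best_value = None, float("-inf")
    for r, row in enumerate(reward_map):
        for c, value in enumerate(row):
            if value > best_value:
                best, best_value = (r, c), value
    return best


# Hand-set weights for illustration only; features are [s, row, col, s^2, row^2, col^2],
# so the predicted reward is row - row^2 + col - col^2, which peaks at the screen centre.
example_weights = [0.0, 1.0, 1.0, 0.0, -1.0, -1.0]
example_map = build_reward_map(example_weights, state=[0.5], height=3, width=3)
example_action = choose_action(example_map)
```

In a learned system the weights would be fit from observed rewards rather than set by hand, and an agent would typically sample from the map instead of always taking the maximum so that it keeps exploring; both choices are left out here to keep the sketch short.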