Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
This paper addresses the challenge of multitask and transfer learning for autonomous agents. The contribution is incremental, since it builds on existing deep reinforcement learning and model compression techniques.
The paper tackles the problem of enabling an agent to learn and transfer knowledge across multiple tasks by introducing Actor-Mimic, a method that trains a single policy network on a set of distinct tasks under the guidance of several expert teachers. The learned representations then speed up learning in new environments for which no expert guidance is available.
The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent. Towards this goal, we define a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains. This method, termed "Actor-Mimic", exploits the use of deep reinforcement learning and model compression techniques to train a single policy network that learns how to act in a set of distinct tasks by using the guidance of several expert teachers. We then show that the representations learnt by the deep policy network are capable of generalizing to new tasks with no prior expert guidance, speeding up learning in novel environments. Although our method can in general be applied to a wide range of problems, we use Atari games as a testing environment to demonstrate these methods.
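The abstract does not spell out the training objective, so the following is only a minimal sketch of how a student policy might be fit to an expert teacher in the model-compression style. It assumes the expert exposes per-action Q-values, that these are turned into soft targets with a temperature softmax, and that the student is trained with a cross-entropy loss against those targets. The names `softmax`, `mimic_loss`, `fit_student_logits`, and the `temperature`, `lr`, and `steps` parameters are hypothetical and chosen for illustration. The student here is just a table of logits, not a deep network.

```python
import numpy as np

def softmax(x, temperature=1.0):
    """Temperature softmax over the last axis (numerically stable)."""
    z = np.asarray(x, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mimic_loss(expert_q, student_logits, temperature=1.0):
    """Mean cross-entropy between expert soft targets and the student policy.

    Assumption: the expert's Q-values are converted to a target policy
    with a temperature softmax; the abstract does not specify this.
    """
    target = softmax(expert_q, temperature)
    s = np.asarray(student_logits, dtype=float)
    s = s - s.max(axis=-1, keepdims=True)
    log_pi = s - np.log(np.exp(s).sum(axis=-1, keepdims=True))
    return float(-(target * log_pi).sum(axis=-1).mean())

def fit_student_logits(expert_q, temperature=1.0, lr=1.0, steps=2000):
    """Gradient descent on the student's logits for a fixed batch of states.

    For cross-entropy with soft targets, the per-state gradient with
    respect to the logits is (student policy - target policy).
    """
    target = softmax(expert_q, temperature)
    logits = np.zeros_like(target)
    for _ in range(steps):
        logits -= lr * (softmax(logits) - target)
    return logits

if __name__ == "__main__":
    # Toy expert Q-values for two states with three actions each.
    expert_q = np.array([[1.0, 2.0, 0.5], [0.0, -1.0, 3.0]])
    logits = fit_student_logits(expert_q)
    print("loss before:", mimic_loss(expert_q, np.zeros_like(expert_q)))
    print("loss after: ", mimic_loss(expert_q, logits))
```

In a multitask setting along the lines the abstract describes, the logits would presumably come from one shared network, and each training batch would pair states from one task with that task's expert. That part is not shown here.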