Task Relabelling for Multi-task Transfer using Successor Features
This work addresses the problem of inflexible policies in reinforcement learning, which is relevant to both researchers and practitioners, but the contribution is incremental because it builds on existing Successor Feature (SF) mechanisms.
The paper tackles the problem of enabling deep reinforcement learning agents to adapt to new tasks without retraining. It pre-trains Successor Features (SFs) without rewards in a custom environment with resource collection, traps, and crafting. The results show that a task relabelling method improves agent performance when transferring to various target tasks using only a task vector, yielding gains in transfer efficiency.
Deep Reinforcement Learning has recently been very successful, with various works tackling complex domains. Most works are concerned with learning a single policy that solves the target task. Such a policy is fixed, in the sense that the agent is unable to adapt if the environment changes. Successor Features (SFs) provide a mechanism for learning policies that are not tied to any particular reward function. In this work we investigate how SFs may be pre-trained without observing any reward in a custom environment that features resource collection, traps and crafting. After pre-training, we expose the SF agents to various target tasks and evaluate how well they transfer to them. Transfer is done without any further training of the SF agents, simply by providing a task vector. For training the SFs we propose a task relabelling method which greatly improves the agent's performance.
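To make the idea of transferring "just by providing a task vector" concrete, the following is a minimal sketch of the standard SF identity Q(s, a) = psi(s, a) . w, combined with generalized policy improvement (GPI) over several pre-trained policies. It is an illustration under stated assumptions, not the paper's implementation: the feature layout (wood, stone, trap), the numbers in `PSIS`, and the function names are all hypothetical, and the abstract does not say whether the authors use GPI or a single SF network conditioned on the task vector.

```python
from typing import List, Sequence

def q_values(psi: Sequence[Sequence[float]], w: Sequence[float]) -> List[float]:
    """Return Q(s, a) = psi(s, a) . w for every action a in one state."""
    return [sum(p * wi for p, wi in zip(psi_a, w)) for psi_a in psi]

def gpi_action(psis: Sequence[Sequence[Sequence[float]]], w: Sequence[float]) -> int:
    """Generalized policy improvement: take the max over policies, then argmax over actions."""
    per_policy = [q_values(psi, w) for psi in psis]
    n_actions = len(per_policy[0])
    best = [max(q[a] for q in per_policy) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: best[a])

# Hypothetical successor features for ONE state: PSIS[policy][action][feature].
# Assumed feature order: (wood collected, stone collected, trap triggered).
PSIS = [
    [[2.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]],  # pre-trained policy A
    [[0.0, 0.0, 0.0], [0.0, 3.0, 1.0], [1.0, 0.0, 0.0]],  # pre-trained policy B
]

# Two target tasks expressed only as task vectors; no retraining takes place.
w_wood = [1.0, 0.0, -1.0]   # reward wood, penalise traps
w_stone = [0.0, 1.0, -1.0]  # reward stone, penalise traps

action_for_wood = gpi_action(PSIS, w_wood)
action_for_stone = gpi_action(PSIS, w_stone)
```

The same stored features yield different greedy actions for the two task vectors, which is the property that lets a reward-free pre-trained agent be redirected at test time. The task relabelling method itself concerns how the SFs are trained and is not sketched here, since this section does not describe it in detail.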