The Role of Exploration for Task Transfer in Reinforcement Learning
This work addresses the problem of efficient task adaptation in reinforcement learning, which is of interest to researchers in the field. Its contribution is incremental: it reviews and analyzes existing methods without introducing new techniques.
The paper re-examines the exploration--exploitation trade-off in reinforcement learning for online task transfer, hypothesizing that exploration strategies that anticipate future tasks can improve transfer efficiency. It reviews existing exploration methods, defines a taxonomy to organize them, and suggests directions for future research.
The exploration--exploitation trade-off in reinforcement learning (RL) is a well-known and much-studied problem of balancing greedy action selection against the gathering of novel experience. Exploration methods are usually studied only in the context of learning the optimal policy for a single task. However, in online task transfer, where the task changes during online operation, we hypothesize that exploration strategies that anticipate the need to adapt to future tasks can have a pronounced impact on the efficiency of transfer. As such, we re-examine the exploration--exploitation trade-off in the context of transfer learning. In this work, we review reinforcement learning exploration methods, define a taxonomy with which to organize them, analyze these methods' differences in the context of task transfer, and suggest avenues for future investigation.
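As a minimal illustration of the trade-off described above, the sketch below shows epsilon-greedy action selection, one of the simplest and most widely known exploration rules. It is not drawn from the taxonomy or the analysis in this work; the function name `epsilon_greedy`, the value estimates `q_values`, and the parameter `epsilon` are assumptions introduced only for this example.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return an action index for the given value estimates.

    With probability `epsilon` the agent explores (uniform random action);
    otherwise it exploits (action with the highest estimated value).
    All names here are illustrative assumptions, not the paper's notation.
    """
    if rng.random() < epsilon:
        # Explore: sample any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the greedy action (first index wins on ties).
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    estimates = [0.1, 0.9, 0.3]  # hypothetical value estimates for 3 actions
    rng = random.Random(0)       # seeded generator for reproducibility
    picks = [epsilon_greedy(estimates, 0.1, rng) for _ in range(1000)]
    # Most picks should be the greedy action (index 1), with occasional others.
    print({a: picks.count(a) for a in range(len(estimates))})
```

With `epsilon = 0` the agent is purely greedy and never gathers novel experience. A small positive `epsilon` keeps some experience flowing from actions that currently look suboptimal, which is the kind of experience that may matter if the task later changes.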