Decoupling Dynamics and Reward for Transfer Learning
This work addresses poor generalization in reinforcement learning, though the contribution appears incremental because it builds on existing decoupling concepts.
The paper addresses the poor generalization of reinforcement learning methods to task perturbations. It proposes a decoupled learning strategy that separates the task representation, dynamics, and reward function, which improves performance and gives robust transfer to changes in dynamics and reward.
Current reinforcement learning (RL) methods can successfully learn single tasks but often generalize poorly to modest perturbations in the task domain or training procedure. In this work, we present a decoupled learning strategy for RL that creates a shared representation space where knowledge can be robustly transferred. We separate learning the task representation, the forward dynamics, the inverse dynamics, and the reward function of the domain, and show that this decoupling improves performance within the task, transfers well to changes in dynamics and reward, and can be effectively used for online planning. Empirical results show good performance in both continuous and discrete RL domains.
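The decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DecoupledModel`, the linear modules, the random initialisation, and the mean-squared-error losses are all assumptions chosen for brevity. The sketch only shows the structural idea, namely that an encoder defines a shared latent space while the forward dynamics, inverse dynamics, and reward modules each have their own parameters and their own loss.

```python
import random


def linear(weights, x):
    """Apply a bias-free dense layer: returns weights @ x as a list."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def init(rows, cols, rng):
    """Small random weight matrix (hypothetical initialisation scheme)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def mse(pred, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


class DecoupledModel:
    """Four separately parameterised modules sharing one latent space (assumed design)."""

    def __init__(self, state_dim, action_dim, latent_dim, seed=0):
        rng = random.Random(seed)
        self.encoder = init(latent_dim, state_dim, rng)                # s -> z
        self.forward = init(latent_dim, latent_dim + action_dim, rng)  # (z, a) -> z'
        self.inverse = init(action_dim, 2 * latent_dim, rng)           # (z, z') -> a
        self.reward = init(1, latent_dim + action_dim, rng)            # (z, a) -> r

    def encode(self, state):
        return linear(self.encoder, state)

    def losses(self, state, action, next_state, reward):
        """Return one loss per module so each can be trained or swapped on its own."""
        z, z_next = self.encode(state), self.encode(next_state)
        return {
            "forward": mse(linear(self.forward, z + action), z_next),
            "inverse": mse(linear(self.inverse, z + z_next), action),
            "reward": mse(linear(self.reward, z + action), [reward]),
        }


# Usage on a single made-up transition (values are illustrative only).
model = DecoupledModel(state_dim=4, action_dim=2, latent_dim=3)
transition = ([0.1, -0.2, 0.3, 0.0], [1.0, 0.0], [0.2, -0.1, 0.3, 0.1], 0.5)
out = model.losses(*transition)
```

Because the modules do not share parameters beyond the encoder, a change in the reward function would, under these assumptions, only require replacing or retraining `model.reward`, while the encoder and dynamics modules are reused unchanged. The same reasoning applies to a change in dynamics and the `forward` and `inverse` modules.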