LG, AI | Oct 15, 2020

Knowledge Transfer in Multi-Task Deep Reinforcement Learning for Continuous Control

arXiv:2010.07494v2 | 56 citations
Originality: Incremental advance
AI Analysis

This paper addresses multi-task efficiency in continuous control for robotics and AI applications. It appears incremental, since it builds on existing actor-critic architectures and knowledge transfer concepts.

The paper tackles the challenge of training a single deep reinforcement learning agent for multiple continuous control tasks by introducing a knowledge transfer framework (KTM-DRL) that learns from task-specific teachers, achieving expert-level performance and outperforming state-of-the-art methods by a large margin.

While Deep Reinforcement Learning (DRL) has emerged as a promising approach to many complex tasks, it remains challenging to train a single DRL agent that is capable of undertaking multiple different continuous control tasks. In this paper, we present a Knowledge Transfer based Multi-task Deep Reinforcement Learning framework (KTM-DRL) for continuous control, which enables a single DRL agent to achieve expert-level performance in multiple different tasks by learning from task-specific teachers. In KTM-DRL, the multi-task agent first leverages an offline knowledge transfer algorithm designed particularly for the actor-critic architecture to quickly learn a control policy from the experience of task-specific teachers, and then it employs an online learning algorithm to further improve itself by learning from new online transition samples under the guidance of those teachers. We perform a comprehensive empirical study with two commonly-used benchmarks in the MuJoCo continuous control task suite. The experimental results well justify the effectiveness of KTM-DRL and its knowledge transfer and online learning algorithms, as well as its superiority over the state-of-the-art by a large margin.
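To make the offline knowledge transfer stage described in the abstract more concrete, here is a minimal sketch, not the paper's implementation. It assumes frozen linear-tanh teachers, a single student weight matrix over task-indexed state features, and a plain mean-squared-error loss between student and teacher actions. All names (`teacher_action`, `student_action`, `transfer_loss`, `features`) are hypothetical, randomly sampled states stand in for teacher experience from MuJoCo, and the critic transfer and the online learning stage are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM, N_TASKS = 4, 2, 2

# Hypothetical frozen teachers: one small policy per task.
teacher_weights = [rng.normal(size=(STATE_DIM, ACTION_DIM)) for _ in range(N_TASKS)]

def teacher_action(task, states):
    return np.tanh(states @ teacher_weights[task])

def features(task, states):
    # Task-indexed features: the state is placed in the slot of its task,
    # so one student weight matrix serves every task.
    feats = np.zeros((len(states), N_TASKS * STATE_DIM))
    feats[:, task * STATE_DIM:(task + 1) * STATE_DIM] = states
    return feats

# Single multi-task student policy, initialised to output zero actions.
student_w = np.zeros((N_TASKS * STATE_DIM, ACTION_DIM))

def student_action(task, states):
    return np.tanh(features(task, states) @ student_w)

def transfer_loss(task, states):
    # Assumed loss: mean squared error between student and teacher actions.
    diff = student_action(task, states) - teacher_action(task, states)
    return float(np.mean(diff ** 2))

# Stand-in for teacher experience: random states per task (an assumption).
buffers = [rng.normal(size=(256, STATE_DIM)) for _ in range(N_TASKS)]

def total_loss():
    return sum(transfer_loss(t, buffers[t]) for t in range(N_TASKS))

initial_loss = total_loss()

LEARNING_RATE, STEPS = 0.2, 500
for _ in range(STEPS):
    for task in range(N_TASKS):
        x = features(task, buffers[task])
        a = np.tanh(x @ student_w)
        target = teacher_action(task, buffers[task])
        # Gradient of the MSE through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
        grad_z = 2.0 * (a - target) * (1.0 - a ** 2) / a.size
        student_w -= LEARNING_RATE * (x.T @ grad_z)

final_loss = total_loss()
print(f"transfer loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

In a fuller version, the online stage would replace the fixed buffers with transitions collected by the student itself, while still querying the teachers for guidance. How the paper weights those two signals is not stated in the abstract, so it is left out here.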

Code Implementations: 1 repo