AI, May 11, 2018

Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent's Demonstration

arXiv:1805.04493v1, 8.59 citations

Originality: Incremental advance

AI Analysis

This paper addresses the data inefficiency of reinforcement learning agents, though the contribution appears incremental because it builds on prior transfer methods such as HAT and CHAT.

The paper tackles reinforcement learning's need for large amounts of data by leveraging demonstrations from humans or other agents, so that an untrained agent can reach high performance quickly. It introduces DRoP, an effective transfer approach that dynamically integrates offline knowledge with online learning.

Reinforcement learning has enjoyed multiple successes in recent years. However, these successes typically require very large amounts of data before an agent achieves acceptable performance. This paper introduces a novel way of combating such requirements by leveraging existing (human or agent) knowledge. In particular, this paper uses demonstrations from agents and humans, allowing an untrained agent to quickly achieve high performance. We empirically compare with, and highlight the weakness of, HAT and CHAT, methods of transferring knowledge from a source agent/human to a target agent. This paper introduces an effective transfer approach, DRoP, combining the offline knowledge (demonstrations recorded before learning) with online confidence-based performance analysis. DRoP dynamically involves the demonstrator's knowledge, integrating it into the reinforcement learning agent's online learning loop to achieve efficient and robust learning.
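
The idea of weighing recorded demonstrations against the agent's own online estimates can be illustrated with a minimal sketch. This is not the paper's DRoP algorithm: the class name `DynamicReuseAgent`, the tabular Q-learning setting, the per-state confidence table, and the moving-average confidence update are all assumptions chosen for illustration. The sketch only shows one plausible way an agent might follow a demonstration-derived policy while that source looks reliable, and fall back to its own learned values once it does not.

```python
import random
from collections import defaultdict


class DynamicReuseAgent:
    """Hypothetical sketch: tabular Q-learning that can defer to a
    demonstration-derived policy on a per-state basis."""

    def __init__(self, actions, demo_policy, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.actions = list(actions)
        # state -> action, assumed to be built offline from demonstrations
        self.demo_policy = dict(demo_policy)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)     # (state, action) -> value estimate
        self.conf = defaultdict(float)  # (state, source) -> confidence
        self.rng = random.Random(seed)

    def greedy_action(self, state):
        # Ties resolve to the first action in self.actions.
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def select(self, state):
        """Return (action, source), where source is 'prior' or 'self'."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions), "self"
        use_prior = (
            state in self.demo_policy
            and self.conf[(state, "prior")] >= self.conf[(state, "self")]
        )
        if use_prior:
            return self.demo_policy[state], "prior"
        return self.greedy_action(state), "self"

    def update(self, state, action, source, reward, next_state, done):
        """One-step Q-learning update plus a confidence update."""
        best_next = 0.0 if done else max(
            self.q[(next_state, a)] for a in self.actions
        )
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        # Assumed confidence rule: a moving average of the one-step targets
        # observed when this source chose the action in this state.
        key = (state, source)
        self.conf[key] += self.alpha * (target - self.conf[key])
```

In this sketch both confidences start at zero and ties favor the demonstration, so the agent leans on prior knowledge early and stops doing so in a state only after that source's observed targets fall below those of its own policy.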
