LG · ML · Jun 23, 2014

Reinforcement and Imitation Learning via Interactive No-Regret Learning

arXiv:1406.5979v1 · 284 citations
Originality: Incremental advance
AI Analysis

This work offers a unifying framework for imitation and reinforcement learning. It is rated an incremental advance because it builds on existing interactive no-regret learning methods.

The paper addresses imitation and reinforcement learning settings in which a learner's predictions affect the input distribution it later encounters. It develops an interactive imitation learning approach that uses cost information, extends the technique to reinforcement learning, and provides theoretical support for online approximate policy iteration.

Recent work has demonstrated that problems (particularly imitation learning and structured prediction) where a learner's predictions influence the input distribution it is tested on can be naturally addressed by an interactive approach and analyzed using no-regret online learning. These approaches to imitation learning, however, neither require nor benefit from information about the cost of actions. We extend existing results in two directions: first, we develop an interactive imitation learning approach that leverages cost information; second, we extend the technique to address reinforcement learning. The results provide theoretical support to the commonly observed successes of online approximate policy iteration. Our approach suggests a broad new family of algorithms and provides a unifying view of existing techniques for imitation and reinforcement learning.
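To make the idea of cost-aware interactive imitation learning concrete, here is a minimal Python sketch in the spirit of the abstract. It is not the paper's algorithm. Everything in it is an illustrative assumption: the six-state chain environment, the always-move-right expert, and the names `step`, `expert`, `expert_cost_to_go`, `make_policy`, and `train`. For determinism it enumerates every time step and action instead of sampling them, and it swaps in a tabular running average where a cost-sensitive no-regret learner would normally go.

```python
from collections import defaultdict

N_STATES = 6   # hypothetical chain of states 0..5; state 5 is an absorbing goal
HORIZON = 10   # hypothetical task horizon


def step(state, action):
    """Action 0 moves left, action 1 moves right; each step costs 1 until the goal."""
    if state == N_STATES - 1:
        return state, 0.0
    nxt = max(0, state - 1) if action == 0 else state + 1
    return nxt, 1.0


def expert(state):
    """Toy expert (an assumption): always move right toward the goal."""
    return 1


def expert_cost_to_go(state, action, steps_left):
    """Total cost of taking `action` now, then following the expert."""
    total, s, a = 0.0, state, action
    for _ in range(steps_left):
        s, cost = step(s, a)
        total += cost
        a = expert(s)
    return total


def make_policy(cost_sum, count):
    """Greedy policy over the average observed cost-to-go per (state, action)."""
    def estimate(state, action):
        n = count.get((state, action), 0)
        return cost_sum[(state, action)] / n if n else float("inf")

    def policy(state):
        return min((0, 1), key=lambda a: estimate(state, a))

    return policy


def train(n_iters=5):
    cost_sum, count = defaultdict(float), defaultdict(int)
    policy = expert  # the first iteration rolls in with the expert
    for _iteration in range(n_iters):
        for t in range(HORIZON):
            # Roll in for t steps with the current policy, so training states
            # come from the distribution the learner itself induces.
            s = 0
            for _k in range(t):
                s, _cost = step(s, policy(s))
            # Try each action once, then let the expert finish the episode.
            for a in (0, 1):
                q = expert_cost_to_go(s, a, HORIZON - t)
                cost_sum[(s, a)] += q   # aggregate data across iterations
                count[(s, a)] += 1
        policy = make_policy(cost_sum, count)
    return policy


learned = train()
actions = [learned(s) for s in range(N_STATES - 1)]
print(actions)
```

On this toy chain the learned policy moves right in every non-goal state, so the script prints `[1, 1, 1, 1, 1]`. Two ingredients from the abstract are visible here: the roll-in uses the learner's own policy, and the training signal is the cost of each action rather than only the expert's chosen action.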

