LG · AI · ML · Apr 3, 2017

Multi-Advisor Reinforcement Learning

arXiv:1704.00756v2 · 24 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific problem in reinforcement learning, but it appears incremental because it builds on existing multi-advisor frameworks.

The paper tackles a single-agent reinforcement learning problem by distributing it across multiple advisors, each with a different focus. It introduces a novel empathic planning method to address flaws in the existing egocentric and agnostic approaches, and validates it empirically on a fruit collection task.

We consider tackling a single-agent RL problem by distributing it to $n$ learners. These learners, called advisors, endeavour to solve the problem from a different focus. Their advice, taking the form of action values, is then communicated to an aggregator, which is in control of the system. We show that the local planning method for the advisors is critical and that none of the ones found in the literature is flawless: the egocentric planning overestimates values of states where the other advisors disagree, and the agnostic planning is inefficient around danger zones. We introduce a novel approach called empathic and discuss its theoretical aspects. We empirically examine and validate our theoretical findings on a fruit collection task.
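The architecture described in the abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the aggregation rule or the three planning methods, so everything here is an assumption based on one reading of the paper: the aggregator is taken to be an unweighted sum of the advisors' action values, egocentric planning bootstraps on the advisor's own maximum, agnostic planning bootstraps on a uniform average over actions, and empathic planning bootstraps on the advisor's value for the action the aggregator would pick. All function names are hypothetical.

```python
from typing import List, Sequence

# q_next[j][a] is advisor j's action value for action a in the next state.
QTable = Sequence[Sequence[float]]


def aggregate(advisor_q: QTable) -> List[float]:
    """Combine advisors' action values (assumed: unweighted sum per action)."""
    n_actions = len(advisor_q[0])
    return [sum(q[a] for q in advisor_q) for a in range(n_actions)]


def greedy_action(values: Sequence[float]) -> int:
    """Index of the highest-valued action (first one wins on ties)."""
    return max(range(len(values)), key=lambda a: values[a])


def egocentric_value(q_next: QTable, j: int) -> float:
    """Assumed egocentric bootstrap: advisor j acts as if it were in control."""
    return max(q_next[j])


def agnostic_value(q_next: QTable, j: int) -> float:
    """Assumed agnostic bootstrap: uniform average over next actions."""
    return sum(q_next[j]) / len(q_next[j])


def empathic_value(q_next: QTable, j: int) -> float:
    """Assumed empathic bootstrap: advisor j's value for the aggregator's greedy action."""
    return q_next[j][greedy_action(aggregate(q_next))]


if __name__ == "__main__":
    # Two advisors that disagree about which of two actions is best.
    q_next = [[1.0, 0.0], [0.0, 0.8]]
    for j in range(len(q_next)):
        print(j, egocentric_value(q_next, j),
              agnostic_value(q_next, j), empathic_value(q_next, j))
```

In this toy case the aggregator would pick action 0, so the second advisor's egocentric bootstrap (0.8) credits a future it will not get, while its empathic bootstrap (0.0) follows the action the aggregator actually takes. This is one way to picture the overestimation under disagreement that the abstract attributes to egocentric planning.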

