LG, AI, ML
May 24, 2019

InfoRL: Interpretable Reinforcement Learning using Information Maximization

arXiv:1905.10404v1, 5 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for interpretable and diverse policies in complex environments, though the contribution appears incremental because it builds on existing RL methods.

The paper tackles the problem of learning multiple distinct policies for a single task in reinforcement learning, and demonstrates that using information maximization can discover latent codes representing different ways to perform tasks.

Recent advances in reinforcement learning have shown that, given an environment, we can learn to perform a task in that environment if we have access to some form of reward function (dense, sparse, or derived from inverse reinforcement learning, IRL). However, most algorithms focus on learning a single best policy for a given set of tasks. In this paper, we focus on an algorithm that learns not just to perform a task but to perform the same task in different ways. As we know, when the environment is complex enough, there always exist multiple ways to perform a task. We show that, using the concept of information maximization, it is possible to learn latent codes for discovering multiple ways to perform any given task in an environment.
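To make the idea of information maximization over latent codes more concrete, here is a minimal sketch in Python. It is not taken from the paper. It assumes the common InfoGAN-style variational bound, I(c; x) >= E[log Q(c | x)] + H(c), where c is a discrete latent code, x is whatever the policy produces (for example a state-action pair), and Q is a learned posterior over codes. The function names `mi_lower_bound` and `shaped_reward`, and the `weight` parameter, are hypothetical and chosen only for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against log(0)


def mi_lower_bound(q_probs, codes, prior):
    """Monte Carlo estimate of E[log Q(c|x)] + H(c).

    This is a variational lower bound on the mutual information I(c; x).
    q_probs: (N, K) array, row i is an assumed posterior Q(. | x_i) over K codes.
    codes:   (N,) integer array, the code each sample was generated with.
    prior:   (K,) array, the prior p(c) the codes were sampled from.
    """
    q_probs = np.asarray(q_probs, dtype=float)
    codes = np.asarray(codes, dtype=int)
    prior = np.asarray(prior, dtype=float)
    # Log-probability the posterior assigns to the code actually used.
    log_q = np.log(q_probs[np.arange(len(codes)), codes] + EPS)
    # Entropy of the prior over codes.
    entropy = -np.sum(prior * np.log(prior + EPS))
    return float(log_q.mean() + entropy)


def shaped_reward(task_reward, q_probs, codes, prior, weight=0.1):
    """Add a per-sample information bonus log Q(c|x) - log p(c) to the task reward.

    The additive form and the `weight` value are assumptions for this sketch,
    not the paper's stated objective.
    """
    q_probs = np.asarray(q_probs, dtype=float)
    codes = np.asarray(codes, dtype=int)
    prior = np.asarray(prior, dtype=float)
    idx = np.arange(len(codes))
    bonus = np.log(q_probs[idx, codes] + EPS) - np.log(prior[codes] + EPS)
    return np.asarray(task_reward, dtype=float) + weight * bonus


# Toy check with four equally likely codes.
prior = np.full(4, 0.25)
codes = np.array([0, 1, 2, 3])
perfect_q = np.eye(4)            # posterior always recovers the code
uninformed_q = np.full((4, 4), 0.25)  # posterior ignores the behaviour

bound_perfect = mi_lower_bound(perfect_q, codes, prior)
bound_uninformed = mi_lower_bound(uninformed_q, codes, prior)
rewards = shaped_reward(np.ones(4), perfect_q, codes, prior)
```

In this toy setup the bound reaches its maximum, the entropy of the code prior, when the code can be recovered perfectly from behaviour, and drops to zero when behaviour carries no information about the code. That is the sense in which maximizing such a bound pushes different codes toward visibly different ways of performing the task.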

