LG · ML · Jan 16, 2020

MIME: Mutual Information Minimisation Exploration

arXiv:2001.05636v11 citations
AI Analysis

This paper addresses a key bottleneck in reinforcement learning: agents that explore by surprisal stall in environments with sharp transitions.

The paper tackles the problem of reinforcement learning agents getting stuck at abrupt environmental transitions by proposing Mutual Information Minimising Exploration (MIME), in which the agent learns a latent representation of the environment without predicting future states. The authors report significantly better performance across sharp transition boundaries, performance matching surprisal-driven agents elsewhere, and state-of-the-art results on Gravitar, Montezuma's Revenge, and Doom.

We show that reinforcement learning agents that learn by surprise (surprisal) get stuck at abrupt environmental transition boundaries because these transitions are difficult to learn. We propose a counter-intuitive solution, which we call Mutual Information Minimising Exploration (MIME), in which an agent learns a latent representation of the environment without trying to predict future states. We show that our agent performs significantly better over sharp transition boundaries while matching the performance of surprisal-driven agents elsewhere. In particular, we show state-of-the-art performance on difficult learning games such as Gravitar, Montezuma's Revenge and Doom.

