LG ML · Jan 16, 2020

MIME: Mutual Information Minimisation Exploration

Haitao Xu, Brendan McCane, Lech Szymanski, Craig Atkinson

arXiv:2001.05636v1 · 2.31 citations

Originality: Highly original

AI Analysis

This work addresses a key bottleneck in reinforcement learning, namely agents that stall in environments with sharp transitions, and offers a novel solution with potentially broad applicability.

The paper tackles the problem of reinforcement learning agents getting stuck at abrupt environmental transitions by proposing Mutual Information Minimising Exploration (MIME), in which the agent learns a latent representation of the environment without predicting future states. It reports significantly better performance across sharp transition boundaries and state-of-the-art results on games such as Gravitar, Montezuma's Revenge, and Doom.

We show that reinforcement learning agents that learn by surprise (surprisal) get stuck at abrupt environmental transition boundaries because these transitions are difficult to learn. We propose a counter-intuitive solution that we call Mutual Information Minimising Exploration (MIME), where an agent learns a latent representation of the environment without trying to predict future states. We show that our agent performs significantly better over sharp transition boundaries while matching the performance of surprisal-driven agents elsewhere. In particular, we show state-of-the-art performance on difficult learning games such as Gravitar, Montezuma's Revenge and Doom.
