ML, Oct 13, 2017

Unsupervised Real-Time Control through Variational Empowerment

Maximilian Karl, Maximilian Soelch, Philip Becker-Ehmck, Djalel Benbouzid, Patrick van der Smagt, Justin Bayer

arXiv:1710.05101v117.660 citations

Originality: Highly original

AI Analysis

This work addresses the challenge of applying empowerment in nonlinear continuous spaces for unsupervised policy learning. The contribution is incremental in that it builds on prior work, but it improves computational feasibility.

The paper tackles the problem of efficiently computing empowerment for real-time control. It introduces a method for computing a lower bound on empowerment, which lets empowerment serve as an unsupervised cost function in continuous dynamical systems. The resulting policies reliably drive agents into states where they can use their full potential.

We introduce a methodology for efficiently computing a lower bound to empowerment, allowing it to be used as an unsupervised cost function for policy learning in real-time control. Empowerment, being the channel capacity between actions and states, maximises the influence of an agent on its near future. It has been shown to be a good model of biological behaviour in the absence of an extrinsic goal. But empowerment is also prohibitively hard to compute, especially in nonlinear continuous spaces. We introduce an efficient, amortised method for learning empowerment-maximising policies. We demonstrate that our algorithm can reliably handle continuous dynamical systems using system dynamics learned from raw data. The resulting policies consistently drive the agents into states where they can use their full potential.
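To make "channel capacity between actions and states" concrete, the following is a minimal sketch for a tiny discrete system with a single action step. It uses the classic Blahut-Arimoto iteration, which is exact only for small discrete channels and is not the amortised variational lower bound the paper proposes. The function name `empowerment_discrete` and the toy transition tables are hypothetical and chosen only for illustration.

```python
import math


def empowerment_discrete(transition, iters=500, tol=1e-12):
    """Estimate max over p(a) of I(A; S') in nats with Blahut-Arimoto.

    transition[a][s] is p(s' = s | a) for one fixed current state.
    Illustrative only: assumes a small, fully known discrete channel.
    """
    n_actions = len(transition)
    n_states = len(transition[0])
    p_a = [1.0 / n_actions] * n_actions  # start from a uniform action distribution

    def kl_per_action(dist):
        # Marginal over next states induced by the action distribution.
        marg = [sum(dist[a] * transition[a][s] for a in range(n_actions))
                for s in range(n_states)]
        # KL divergence between each action's outcome distribution and the marginal.
        return [sum(p * math.log(p / marg[s])
                    for s, p in enumerate(transition[a]) if p > 0.0)
                for a in range(n_actions)]

    for _ in range(iters):
        kl = kl_per_action(p_a)
        # Reweight actions towards those whose outcomes are most distinguishable.
        unnorm = [p * math.exp(d) for p, d in zip(p_a, kl)]
        z = sum(unnorm)
        new_p_a = [u / z for u in unnorm]
        converged = max(abs(n - o) for n, o in zip(new_p_a, p_a)) < tol
        p_a = new_p_a
        if converged:
            break

    # Mutual information under the final action distribution.
    kl = kl_per_action(p_a)
    value = sum(p * d for p, d in zip(p_a, kl))
    return value, p_a


# Hypothetical toy states: in "free", each action reaches a distinct next state;
# in "stuck", every action leads to the same next state.
free = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
stuck = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

emp_free, _ = empowerment_discrete(free)
emp_stuck, _ = empowerment_discrete(stuck)
print(emp_free, emp_stuck)
```

In this sketch the "free" state has higher empowerment than the "stuck" state, because the agent's choice of action there determines what happens next. An exact computation like this scales poorly with the number of actions and states and does not carry over directly to continuous spaces, which is the difficulty the abstract describes.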
