Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images
This work addresses model-based control from visual inputs, a problem relevant to robotics and other AI systems. It is an incremental improvement that applies variational autoencoders whose latent dynamics are constrained to be locally linear.
The authors tackle the problem of controlling non-linear dynamical systems from raw pixel images by introducing Embed to Control (E2C), a deep generative model that learns locally linear latent dynamics. The model exhibits strong performance on a variety of complex control problems.
We introduce Embed to Control (E2C), a method for model learning and control of non-linear dynamical systems from raw pixel images. E2C consists of a deep generative model, belonging to the family of variational autoencoders, that learns to generate image trajectories from a latent space in which the dynamics is constrained to be locally linear. Our model is derived directly from an optimal control formulation in latent space, supports long-term prediction of image sequences and exhibits strong performance on a variety of complex control problems.
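To make the phrase "locally linear" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a latent transition of the form z_{t+1} = A(z_t) z_t + B(z_t) u_t + o(z_t), where the matrices A, B and the offset o depend on the current latent state. The names `local_linearization`, `latent_step`, and `rollout`, the dimensions, and the fixed random weights standing in for a learned network are all hypothetical and chosen only for illustration. The encoder and decoder that map images to and from the latent space are omitted.

```python
import numpy as np

LATENT_DIM, CONTROL_DIM = 2, 1  # illustrative sizes, not from the paper
rng = np.random.default_rng(0)

# Hypothetical stand-in for a learned network mapping z_t to (A_t, B_t, o_t).
# In a trained model these weights would come from optimization.
W_A = rng.normal(scale=0.1, size=(LATENT_DIM * LATENT_DIM, LATENT_DIM))
W_B = rng.normal(scale=0.1, size=(LATENT_DIM * CONTROL_DIM, LATENT_DIM))
W_o = rng.normal(scale=0.1, size=(LATENT_DIM, LATENT_DIM))


def local_linearization(z):
    """Return state-dependent (A, B, o) for the latent state z."""
    h = np.tanh(z)  # simple non-linear feature of the latent state
    A = np.eye(LATENT_DIM) + (W_A @ h).reshape(LATENT_DIM, LATENT_DIM)
    B = (W_B @ h).reshape(LATENT_DIM, CONTROL_DIM)
    o = W_o @ h
    return A, B, o


def latent_step(z, u):
    """One transition: linear in u, and in z once (A, B, o) are held fixed."""
    A, B, o = local_linearization(z)
    return A @ z + B @ u + o


def rollout(z0, controls):
    """Predict a latent trajectory by applying latent_step repeatedly."""
    zs = [z0]
    for u in controls:
        zs.append(latent_step(zs[-1], u))
    return np.stack(zs)


z0 = np.array([0.5, -0.3])
trajectory = rollout(z0, [np.array([0.1])] * 5)
print(trajectory.shape)
```

Because the matrices are re-evaluated at every step, the overall dynamics remain non-linear, while each individual step is a linear (affine) function of the control. This local structure is what makes a latent space of this kind amenable to optimal control methods that rely on linearized dynamics.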