Learn Proportional Derivative Controllable Latent Space from Pixels
This work addresses computational inefficiency in vision-based control for robotics or autonomous systems, representing an incremental improvement over existing latent space dynamics models.
The paper tackles the challenge of real-time vision-based model predictive control by learning a proportional derivative controllable latent space from pixels, enabling the use of a simple PD-controller for effective control. It shows that this method outperforms baselines in goal reaching and trajectory tracking across various environments.
Recent advances in latent space dynamics models learned from pixels show promising progress in vision-based model predictive control (MPC). However, executing MPC in real time can be challenging because of its intensive computational cost at each timestep. We propose additional learning objectives that enforce that the learned latent space is proportional derivative (PD) controllable. At execution time, a simple PD-controller can be applied directly to the latent space encoded from pixels, producing simple and effective control for systems with visual observations. We show that our method outperforms baseline methods, producing robust goal reaching and trajectory tracking in various environments.
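To make the execution-time idea concrete, the following is a minimal sketch of a PD law applied to latent states, not the paper's implementation. It assumes a fixed goal latent, a finite-difference derivative term, and an action with the same dimension as the latent state. The names `pd_control` and `simulate`, the gains `kp` and `kd`, and the toy integrator dynamics that stand in for the learned encoder and the real environment are all hypothetical.

```python
import numpy as np


def pd_control(z, z_prev, z_goal, kp=1.0, kd=0.1, dt=0.1):
    """PD law on the latent error (illustrative gains, assumed fixed goal)."""
    error = z_goal - z
    # With a fixed goal, the error derivative is minus the latent velocity,
    # approximated here by a finite difference of consecutive latents.
    d_error = -(z - z_prev) / dt
    return kp * error + kd * d_error


def simulate(z0, z_goal, steps=200, dt=0.1):
    """Roll out the controller on a toy integrator, z_next = z + dt * u.

    The integrator is an assumption standing in for "encode the next image";
    in a real system z would come from the learned encoder at each timestep.
    """
    z_prev = np.array(z0, dtype=float)
    z = z_prev.copy()
    goal = np.array(z_goal, dtype=float)
    for _ in range(steps):
        u = pd_control(z, z_prev, goal, dt=dt)
        z_prev, z = z, z + dt * u
    return z


if __name__ == "__main__":
    z_final = simulate(z0=[0.0, 0.0], z_goal=[1.0, -2.0])
    print("distance to goal:", np.linalg.norm(z_final - np.array([1.0, -2.0])))
```

Each control step costs only a few vector operations after encoding, in contrast to MPC, which solves an optimization problem at every timestep. This only works if the latent space is shaped so that such a law is effective, which is the property the additional learning objectives are meant to enforce.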