NewtonianVAE: Proportional Control and Goal Identification from Pixels via Physical Latent Spaces
This work addresses the challenge of vision-based planning and control in robotics, with the aim of improving efficiency and interpretability. It appears incremental, however, as it builds on existing latent dynamics learning paradigms.
The paper tackles the problem of enabling proportional control from pixels by learning a latent dynamics model designed for controllability, which yields simpler controllers, faster behavioural cloning, and interpretable goal discovery.
Learning low-dimensional latent state space dynamics models has been a powerful paradigm for enabling vision-based planning and learning for control. We introduce a latent dynamics learning framework that is uniquely designed to induce proportional controllability in the latent space, thus enabling the use of much simpler controllers than prior work. We show that our learned dynamics model enables proportional control from pixels, dramatically simplifies and accelerates behavioural cloning of vision-based controllers, and provides interpretable goal discovery when applied to imitation learning of switching controllers from demonstration.
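To make "proportional control in the latent space" concrete, the following is a minimal sketch, not the paper's implementation. It assumes that an encoder (omitted here) has already mapped the current image and a goal image to latent positions x and x_goal, and that the latent state behaves like a damped double integrator. The function names, the gain kp, the damping coefficient, and the step size dt are all hypothetical choices made for illustration.

```python
def p_control(x, x_goal, kp):
    """Proportional control law: the action is the gain times the latent position error."""
    return [kp * (g - xi) for xi, g in zip(x, x_goal)]


def step(x, v, u, dt=0.1, damping=2.0):
    """One semi-implicit Euler step of an ASSUMED damped double integrator.

    This stands in for a learned latent dynamics model; it is not the model
    from the paper.
    """
    # Update velocity from the applied action and a damping term.
    v_next = [vi + dt * (ui - damping * vi) for vi, ui in zip(v, u)]
    # Update position using the new velocity.
    x_next = [xi + dt * vn for xi, vn in zip(x, v_next)]
    return x_next, v_next


def rollout(x0, x_goal, kp=1.0, steps=300):
    """Drive the latent position from x0 toward x_goal using only the P-controller."""
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(steps):
        u = p_control(x, x_goal, kp)
        x, v = step(x, v, u)
    return x


goal = [1.0, -0.5]  # hypothetical latent goal, e.g. the encoding of a goal image
final = rollout([0.0, 0.0], goal)
error = max(abs(a - b) for a, b in zip(final, goal))
```

Under these assumptions the controller needs no planner or learned policy: a single gain on the latent error is enough to reach the goal, which is the kind of simplification the abstract attributes to a latent space with proportional controllability.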