RO AI LG · Mar 26, 2021

Learning Reactive and Predictive Differentiable Controllers for Switching Linear Dynamical Models

Saumya Saxena, Alex LaGrassa, Oliver Kroemer

arXiv:2103.14256v1 · 5.34 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of learning control strategies for robots in dynamic, contact-rich environments. The advance is incremental because it builds on existing methods such as LQR and switching models.

The authors tackle the problem of learning composite dynamical behaviors for robots in tasks that involve contact switching, such as grasping an object while walking past it. Their framework learns a switching linear dynamical model and uses differentiable LQR policies for control. They demonstrate generalization to different scenarios and robustness to model inaccuracies in simulation and real-world experiments, though specific numerical results are not provided.

Humans leverage the dynamics of the environment and their own bodies to accomplish challenging tasks such as grasping an object while walking past it or pushing off a wall to turn a corner. Such tasks often involve switching dynamics as the robot makes and breaks contact. Learning these dynamics is a challenging problem and prone to model inaccuracies, especially near contact regions. In this work, we present a framework for learning composite dynamical behaviors from expert demonstrations. We learn a switching linear dynamical model with contacts encoded in switching conditions as a close approximation of our system dynamics. We then use discrete-time LQR as the differentiable policy class for data-efficient learning of control to develop a control strategy that operates over multiple dynamical modes and takes into account discontinuities due to contact. In addition to predicting interactions with the environment, our policy effectively reacts to inaccurate predictions such as unanticipated contacts. Through simulation and real-world experiments, we demonstrate generalization of learned behaviors to different scenarios and robustness to model inaccuracies during execution.
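
To make the two ingredients named in the abstract concrete, here is a minimal sketch of a switching linear model with a contact-based switching condition, controlled by one discrete-time LQR gain per mode. It is not the authors' implementation: nothing is learned from demonstrations, and nothing is differentiated through the LQR solve. The 1-D point mass, the spring-damper wall, the cost weights, and all names (MODES, active_mode, lqr_gain, rollout) are assumptions chosen for illustration.

```python
import numpy as np

DT = 0.05  # time step in seconds (illustrative value)

# Hypothetical 1-D point mass with state x = [position, velocity] and a wall at position 0.
# Mode 0: free motion. Mode 1: in contact, with the wall modeled as a spring-damper.
K_WALL, C_WALL = 50.0, 5.0  # assumed wall stiffness and damping
B_COMMON = np.array([[0.0], [DT]])
MODES = {
    0: (np.array([[1.0, DT], [0.0, 1.0]]), B_COMMON),
    1: (np.array([[1.0, DT], [-K_WALL * DT, 1.0 - C_WALL * DT]]), B_COMMON),
}
Q = np.eye(2)        # state cost (assumed)
R = 0.1 * np.eye(1)  # control cost (assumed)


def active_mode(x):
    """Switching condition: contact whenever the mass is at or past the wall."""
    return 1 if x[0] >= 0.0 else 0


def lqr_gain(A, B, Q, R, iters=500):
    """Approximate stationary discrete-time LQR gain via backward Riccati iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K


# One feedback gain per dynamical mode.
GAINS = {m: lqr_gain(A, B, Q, R) for m, (A, B) in MODES.items()}


def rollout(x0, steps):
    """Simulate the closed loop, choosing the gain from the currently observed mode."""
    x = np.asarray(x0, dtype=float)
    traj = [x]
    for _ in range(steps):
        m = active_mode(x)   # react to the mode actually observed at this step
        A, B = MODES[m]
        u = -GAINS[m] @ x    # LQR feedback toward the origin (just touching the wall)
        x = A @ x + B @ u    # switching linear dynamics
        traj.append(x)
    return np.array(traj)
```

Calling rollout(np.array([-1.0, 0.0]), 400) starts the mass in free space, while rollout(np.array([0.2, 0.0]), 400) starts it pressed into the wall. Selecting the gain from the observed mode at every step is a simple stand-in for the reactive behavior the abstract describes. A predictive variant would instead plan the mode sequence ahead of time over a horizon.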
