ValueNetQP: Learned one-step optimal control for legged locomotion
This work addresses the slow solve times of optimal control for legged robots, enabling real-time model predictive control with complex dynamics, though it appears incremental because it builds on existing optimal control and learning approaches.
The paper tackles the challenge of real-time optimal control for legged locomotion by learning to predict the gradient and Hessian of the value function, so that the predictive control problem can be solved quickly as a one-step quadratic program while satisfying constraints such as friction cones. The method is demonstrated in simulation and on a real quadruped robot performing trotting and bounding motions.
Optimal control is a successful approach for generating motions for complex robots, in particular for legged locomotion. However, these techniques are often too slow to run in real time for model predictive control, or they require drastically simplifying the dynamics model. In this work, we present a method that learns to predict the gradient and Hessian of the problem's value function, enabling fast solution of the predictive control problem with a one-step quadratic program. In addition, our method is able to satisfy constraints such as friction cones and unilateral contact constraints, which are important for highly dynamic locomotion tasks. We demonstrate the capability of our method in simulation and on a real quadruped robot performing trotting and bounding motions.
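The one-step idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a planar point mass with a single contact force, a quadratic value model whose gradient and Hessian are hand-picked numbers standing in for a network's prediction, and a linearized friction cone. The function names, the parameter values, and the use of a projected-gradient loop in place of a general-purpose QP solver are all assumptions made to keep the example free of dependencies.

```python
def project_onto_friction_cone(fx, fz, mu):
    """Euclidean projection of (fx, fz) onto the cone |fx| <= mu * fz.

    The cone also encodes the unilateral constraint fz >= 0.
    """
    if abs(fx) <= mu * fz:
        return fx, fz                      # already feasible
    if fz <= -mu * abs(fx):
        return 0.0, 0.0                    # closest feasible point is the apex
    s = 1.0 if fx > 0 else -1.0            # pick the nearest cone edge
    t = (s * mu * fx + fz) / (1.0 + mu * mu)
    return s * mu * t, t


def solve_one_step_qp(v, v_bar, grad_V, hess_V, mu,
                      mass=2.5, dt=0.01, r=1e-4, gravity=9.81, iters=300):
    """Minimize 0.5*r*|f|^2 + V(v') over the friction cone, where
    v' = v + dt * (f / mass + (0, -gravity)) and V is the quadratic model
    V(v') ~ grad_V.(v' - v_bar) + 0.5 * (v' - v_bar)^T hess_V (v' - v_bar).

    grad_V and hess_V are hypothetical stand-ins for a learned prediction;
    hess_V is assumed symmetric positive semidefinite.
    """
    b = dt / mass
    cx, cz = v[0], v[1] - dt * gravity     # next velocity when f = 0
    (hxx, hxz), (hzx, hzz) = hess_V
    # Upper bound on the largest eigenvalue of the QP Hessian r*I + b^2*hess_V
    # (the trace bounds the largest eigenvalue of a PSD matrix).
    lipschitz = r + b * b * (hxx + hzz)
    fx, fz = 0.0, 0.0
    for _ in range(iters):
        ex = cx + b * fx - v_bar[0]
        ez = cz + b * fz - v_bar[1]
        gx = r * fx + b * (grad_V[0] + hxx * ex + hxz * ez)
        gz = r * fz + b * (grad_V[1] + hzx * ex + hzz * ez)
        fx, fz = project_onto_friction_cone(
            fx - gx / lipschitz, fz - gz / lipschitz, mu)
    return fx, fz


if __name__ == "__main__":
    # Ask for a large forward velocity: the tangential force saturates
    # at the friction cone boundary.
    force = solve_one_step_qp(v=(0.0, 0.0), v_bar=(1.0, 0.0),
                              grad_V=(0.0, 0.0),
                              hess_V=((10.0, 0.0), (0.0, 10.0)), mu=0.6)
    print(force)
```

In this toy setting, a demanded forward velocity that cannot be reached within friction limits yields a force on the cone boundary, while a purely vertical demand yields the unconstrained minimizer. A real controller would presumably use the full robot dynamics, one contact force per foot, and a dedicated QP solver.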