OptiState: State Estimation of Legged Robots using Gated Networks with Transformer-based Vision and Kalman Filtering
This work addresses accurate state estimation for legged robots operating on varied terrain, offering an incremental improvement through a hybrid approach.
The authors tackled state estimation for legged robots by integrating Kalman filtering, optimization, and learning-based methods, resulting in a 65% improvement in Root Mean Squared Error compared to a VIO SLAM baseline on hardware tests with a quadruped robot.
State estimation for legged robots is challenging due to their highly dynamic motion and the limitations imposed by sensor accuracy. By integrating Kalman filtering, optimization, and learning-based modalities, we propose a hybrid solution that combines proprioceptive and exteroceptive information to estimate the state of the robot's trunk. Leveraging joint encoder and IMU measurements, our Kalman filter is enhanced through a single-rigid-body model that incorporates ground reaction force control outputs from convex Model Predictive Control optimization. The estimate is further refined through Gated Recurrent Units, which also consider semantic insights and robot height from a Vision Transformer autoencoder applied to depth images. This framework not only furnishes accurate robot state estimates, including uncertainty evaluations, but can also minimize, through learning, the nonlinear errors that arise from sensor measurements and model simplifications. The proposed methodology is evaluated in hardware using a quadruped robot on various terrains, yielding a 65% improvement in Root Mean Squared Error compared to our VIO SLAM baseline. Code example: https://github.com/AlexS28/OptiState
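To make the filtering stage and the reported metric concrete, the following is a minimal sketch and not the authors' implementation. It assumes a scalar state (for example, trunk height) rather than the full trunk state, and it stands in a single number `u` for the state change predicted by the single-rigid-body model. The function names `kalman_step`, `rmse`, and `rmse_improvement` are hypothetical and do not come from the OptiState repository. The learned refinement by Gated Recurrent Units is not modeled here.

```python
import math


def kalman_step(x, p, u, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    u    : model-predicted state change (assumed stand-in for the
           rigid-body model output; hypothetical simplification)
    z    : sensor measurement of the state
    q, r : process and measurement noise variances
    """
    # Predict: propagate the estimate with the model, inflate the variance.
    x_pred = x + u
    p_pred = p + q

    # Update: blend prediction and measurement using the Kalman gain.
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def rmse(estimates, truth):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(estimates) != len(truth) or not truth:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((e - t) ** 2 for e, t in zip(estimates, truth))
    return math.sqrt(squared / len(truth))


def rmse_improvement(rmse_baseline, rmse_method):
    """Relative RMSE reduction versus a baseline, in percent."""
    return 100.0 * (rmse_baseline - rmse_method) / rmse_baseline


if __name__ == "__main__":
    # Illustrative values only; these are not data from the paper.
    truth = [0.30, 0.31, 0.32, 0.33]
    measurements = [0.28, 0.33, 0.31, 0.35]
    x, p = 0.30, 1.0
    estimates = []
    for z in measurements:
        x, p = kalman_step(x, p, u=0.01, z=z, q=1e-4, r=1e-2)
        estimates.append(x)
    print("filter RMSE:", rmse(estimates, truth))
    print("final variance:", p)
```

The variance `p` returned by each step is the filter's own uncertainty estimate, which corresponds loosely to the uncertainty evaluations mentioned above. Under this definition of `rmse_improvement`, a 65% improvement means the method's RMSE is 35% of the baseline's.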