Linear Policies are Sufficient to Realize Robust Bipedal Walking on Challenging Terrains
This work addresses robust locomotion for bipedal robots in real-world environments, representing an incremental improvement in control simplicity and efficiency.
The authors tackle robust bipedal walking on challenging terrains by learning a single linear policy with only 13 parameters. They report no loss of performance relative to neural network-based policies on slopes, stairs, and outdoor landscapes, and they transfer the controller from simulation to hardware without modification.
In this work, we demonstrate robust walking on the bipedal robot Digit over uneven terrains by learning just a single linear policy. In particular, we propose a new control pipeline in which a high-level trajectory modulator shapes the end-foot ellipsoidal trajectories and a low-level gait controller regulates the torso and ankle orientation. The foot-trajectory modulator uses a linear policy, and the regulator uses a linear PD control law. In contrast to neural network-based policies, the proposed linear policy has only 13 learnable parameters, thereby not only guaranteeing sample-efficient learning but also making the policy simple and interpretable. This is achieved with no loss of performance on challenging terrains such as slopes, stairs, and outdoor landscapes. We first demonstrate robust walking in a custom MuJoCo simulation environment and then transfer the controller directly to hardware with no modification of the control pipeline. We validate the approach by subjecting the biped to a series of pushes and terrain height changes, both indoors and outdoors.
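To make the two-level structure concrete, the following is a minimal sketch of how a linear trajectory modulator and a PD regulator could fit together. It is an illustration under stated assumptions, not the controller described above: the observation contents, the 2x3 weight matrix, the planar semi-ellipse parameterization, the gains, and all function names are hypothetical. The abstract does not say how the 13 parameters are arranged, so the shape used here is deliberately different.

```python
import math


def linear_policy(weights, obs):
    """High-level modulator: action = W @ obs, with W the only learnable part.

    `weights` is a list of rows; its 2x3 shape below is hypothetical and is
    NOT the 13-parameter layout of the paper.
    """
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def ellipse_foot_target(phase, step_length, step_height):
    """Foot target (x, z) on a planar semi-ellipse swing arc (assumed shape).

    `phase` runs from 0 (lift-off) to 1 (touch-down).
    """
    angle = math.pi * phase
    x = -0.5 * step_length * math.cos(angle)
    z = step_height * math.sin(angle)
    return x, z


def pd_torque(kp, kd, target_angle, angle, angular_rate):
    """Low-level regulator: linear PD law on one orientation angle."""
    return kp * (target_angle - angle) - kd * angular_rate


# Hypothetical observation: [torso pitch, torso pitch rate, terrain slope].
obs = [1.0, 2.0, 0.2]
# Hypothetical weights mapping the observation to (step length, step height).
W = [[0.1, 0.0, 0.5],
     [0.0, 0.2, 0.0]]

step_length, step_height = linear_policy(W, obs)
mid_swing = ellipse_foot_target(0.5, step_length, step_height)
torque = pd_torque(kp=10.0, kd=1.0, target_angle=0.0, angle=0.1, angular_rate=0.5)
```

In this sketch the policy only changes the size of the swing ellipse, while orientation is handled separately by the PD term. That split is one way to read the pipeline in the abstract, and it is why the learnable part can stay a small matrix.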