RO AI LG Oct 21, 2025

PGTT: Phase-Guided Terrain Traversal for Perceptive Legged Locomotion

Alexandros Ntagkas, Chairi Kiourt, Konstantinos Chatzilygeroudis

arXiv:2510.18348v1 3.2 h-index: 12

Originality: Incremental advance

AI Analysis

This paper addresses robust terrain traversal for legged robots and offers a solution that generalizes across robot morphologies, though it is an incremental improvement over existing RL methods.

The paper tackles perceptive legged locomotion by proposing PGTT, a deep-RL approach that uses phase-guided reward shaping to enforce gait structure without action priors. It achieves a median +7.5% higher success rate under push disturbances and +9% on discrete obstacles relative to the next-best method.

State-of-the-art perceptive Reinforcement Learning controllers for legged robots either (i) impose oscillator or IK-based gait priors that constrain the action space, add bias to the policy optimization, and reduce adaptability across robot morphologies, or (ii) operate "blind", which makes them struggle to anticipate hind-leg terrain and leaves them brittle to noise. In this paper, we propose Phase-Guided Terrain Traversal (PGTT), a perception-aware deep-RL approach that overcomes these limitations by enforcing gait structure purely through reward shaping, thereby reducing inductive bias in policy learning compared to oscillator/IK-conditioned action priors. PGTT encodes per-leg phase as a cubic Hermite spline that adapts swing height to local heightmap statistics and adds a swing-phase contact penalty, while the policy acts directly in joint space, supporting morphology-agnostic deployment. Trained in MuJoCo (MJX) on procedurally generated stair-like terrains with curriculum and domain randomization, PGTT achieves the highest success under push disturbances (median +7.5% vs. the next-best method) and on discrete obstacles (+9%), with comparable velocity tracking, and converges to an effective policy roughly 2x faster than strong end-to-end baselines. We validate PGTT on a Unitree Go2 using a real-time LiDAR elevation-to-heightmap pipeline, and we report preliminary results on ANYmal-C obtained with the same hyperparameters. These findings indicate that terrain-adaptive, phase-guided reward shaping is a simple and general mechanism for robust perceptive locomotion across platforms.
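
To make the idea of phase-guided reward shaping concrete, the following is a minimal sketch of how a per-leg phase could be mapped to a target foot height with a cubic Hermite curve, with the swing apex scaled by the roughness of the local heightmap and a penalty applied for contact during swing. It is not the authors' implementation: the function names, the 50/50 stance/swing split, the use of the heightmap standard deviation as the terrain statistic, and all constants (`base`, `gain`, `sigma`, `contact_penalty`) are assumptions chosen for illustration.

```python
import math
from statistics import pstdev


def hermite_step(t):
    """Cubic Hermite basis h(t) = 3t^2 - 2t^3 (zero slope at t=0 and t=1)."""
    t = min(max(t, 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def swing_apex(local_heights, base=0.08, gain=1.0):
    """Assumed rule: raise the swing apex with the std of the local heightmap patch."""
    return base + gain * pstdev(local_heights)


def target_foot_height(phase, apex, swing_start=0.5):
    """Target foot clearance for a leg phase in [0, 1); stance before swing_start."""
    if phase < swing_start:
        return 0.0
    s = (phase - swing_start) / (1.0 - swing_start)  # swing progress in [0, 1)
    if s < 0.5:
        return apex * hermite_step(2.0 * s)  # lift-off up to the apex
    return apex * (1.0 - hermite_step(2.0 * s - 1.0))  # apex down to touchdown


def phase_reward(phase, foot_height, in_contact, local_heights,
                 sigma=0.02, contact_penalty=1.0, swing_start=0.5):
    """Height-tracking reward minus a penalty for ground contact during swing."""
    apex = swing_apex(local_heights)
    target = target_foot_height(phase, apex, swing_start)
    tracking = math.exp(-((foot_height - target) / sigma) ** 2)
    penalty = contact_penalty if (phase >= swing_start and in_contact) else 0.0
    return tracking - penalty
```

Because the shaping term only scores foot height and contact timing, the policy's action space is left untouched, which is the property the abstract contrasts with oscillator or IK-based action priors. In a full training setup a term like this would be evaluated per leg and summed with the task rewards (for example velocity tracking).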

