Xuerui Wang

8.8 | RO | Apr 13

ACDC: Adaptive Curriculum Planning with Dynamic Contrastive Control for Goal-Conditioned Reinforcement Learning in Robotic Manipulation

Xuerui Wang, Guangyu Ren, Tianhong Dai et al.

Goal-conditioned reinforcement learning has shown considerable potential in robotic manipulation; however, existing approaches remain limited by their reliance on prioritizing collected experience, resulting in suboptimal performance across diverse tasks. Inspired by human learning behaviors, we propose a more comprehensive learning paradigm, ACDC, which integrates multidimensional Adaptive Curriculum (AC) Planning with Dynamic Contrastive (DC) Control to guide the agent along a well-designed learning trajectory. Specifically, at the planning level, the AC component schedules the learning curriculum by dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress. At the control level, the DC component implements the curriculum plan through norm-constrained contrastive learning, enabling magnitude-guided experience selection aligned with the current curriculum focus. Extensive experiments on challenging robotic manipulation tasks demonstrate that ACDC consistently outperforms state-of-the-art baselines in both sample efficiency and final task success rate.
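The balance between diversity-driven exploration and quality-driven exploitation described in the abstract can be pictured with a toy weighting rule. The sketch below is purely illustrative: the function name `curriculum_weights`, the equal-weight average, and the normalization of both inputs to [0, 1] are assumptions made here, not the formula used in ACDC.

```python
def curriculum_weights(success_rate: float, progress: float) -> tuple[float, float]:
    """Return (diversity_weight, quality_weight) that sum to 1.

    Hypothetical blending rule (not from the paper): the quality weight is
    the plain average of the success rate and the training progress, so the
    emphasis moves from exploration to exploitation as both increase.
    """
    if not (0.0 <= success_rate <= 1.0 and 0.0 <= progress <= 1.0):
        raise ValueError("success_rate and progress must both lie in [0, 1]")
    quality_weight = 0.5 * (success_rate + progress)
    diversity_weight = 1.0 - quality_weight
    return diversity_weight, quality_weight


if __name__ == "__main__":
    # Early training, low success: diversity dominates.
    print(curriculum_weights(success_rate=0.1, progress=0.0))
    # Late training, high success: quality dominates.
    print(curriculum_weights(success_rate=0.9, progress=1.0))
```

Any monotone schedule would serve the same illustrative purpose; the average is chosen here only because it is the simplest rule that depends on both signals.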

RO | Feb 18, 2020

Incremental Nonlinear Fault-Tolerant Control of a Quadrotor with Complete Loss of Two Opposing Rotors

Sihao Sun, Xuerui Wang, Qiping Chu et al.

To further expand the flight envelope of quadrotors under actuator failures, we design a nonlinear sensor-based fault-tolerant controller to stabilize a quadrotor with failure of two opposing rotors in the high-speed flight condition (> 8 m/s). The incremental nonlinear dynamic inversion (INDI) approach, which excels in handling model uncertainties, is adopted to compensate for the significant unknown aerodynamic effects. The internal dynamics of this underactuated system are analyzed and subsequently stabilized by redefining the control output. The proposed method can be generalized to control a quadrotor under single-rotor-failure and nominal conditions. For validation, flight tests have been carried out in a large-scale open jet wind tunnel, where the position of a damaged quadrotor can be controlled in the presence of significant wind disturbances. The proposed nonlinear method is compared with a linear quadratic regulator (LQR) approach from the literature to demonstrate its advantages in windy, high-speed flight conditions.
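For readers unfamiliar with INDI, the generic incremental update may help: for dynamics of the form x_dot = f(x) + G(x) u, the controller uses the measured state derivative in place of a model of f and computes only a control increment, u = u_prev + G^+ (nu - x_dot_measured), where nu is the desired state derivative. The sketch below shows that textbook update alone. The function name `indi_step`, the use of a pseudo-inverse, and the toy numbers are assumptions for illustration; it is not the quadrotor controller from the paper.

```python
import numpy as np


def indi_step(u_prev: np.ndarray, xdot_meas: np.ndarray,
              nu: np.ndarray, G: np.ndarray) -> np.ndarray:
    """One generic INDI update: u = u_prev + pinv(G) @ (nu - xdot_meas).

    u_prev    : previously applied control input
    xdot_meas : measured (sensor-based) state derivative at that input
    nu        : desired state derivative (virtual control)
    G         : control effectiveness matrix (assumed known)
    """
    du = np.linalg.pinv(G) @ (nu - xdot_meas)
    return u_prev + du


if __name__ == "__main__":
    # Toy plant: xdot = f + G u, where f is unknown to the controller.
    G = np.diag([2.0, 4.0])
    f_unknown = np.array([0.3, -0.7])
    u = np.zeros(2)
    nu = np.array([1.0, 1.0])

    xdot = f_unknown + G @ u          # "measurement" at the current input
    u = indi_step(u, xdot, nu, G)     # incremental correction
    print(f_unknown + G @ u)          # close to nu although f was never modeled
```

The example shows why the approach tolerates model uncertainty: the unmodeled term enters only through the measurement, so it cancels in the increment.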
