RO Apr 13
ACDC: Adaptive Curriculum Planning with Dynamic Contrastive Control for Goal-Conditioned Reinforcement Learning in Robotic Manipulation
Xuerui Wang, Guangyu Ren, Tianhong Dai et al.
Goal-conditioned reinforcement learning has shown considerable potential in robotic manipulation; however, existing approaches remain limited by their reliance on prioritizing collected experience, resulting in suboptimal performance across diverse tasks. Inspired by human learning behaviors, we propose a more comprehensive learning paradigm, ACDC, which integrates multidimensional Adaptive Curriculum (AC) Planning with Dynamic Contrastive (DC) Control to guide the agent along a well-designed learning trajectory. More specifically, at the planning level, the AC component schedules the learning curriculum by dynamically balancing diversity-driven exploration and quality-driven exploitation based on the agent's success rate and training progress. At the control level, the DC component implements the curriculum plan through norm-constrained contrastive learning, enabling magnitude-guided experience selection aligned with the current curriculum focus. Extensive experiments on challenging robotic manipulation tasks demonstrate that ACDC consistently outperforms the state-of-the-art baselines in both sample efficiency and final task success rate.
MA Jul 15, 2025
A Learning Framework For Cooperative Collision Avoidance of UAV Swarms Leveraging Domain Knowledge
Shuangyao Huang, Haibo Zhang, Zhiyi Huang
This paper presents a multi-agent reinforcement learning (MARL) framework for cooperative collision avoidance of UAV swarms that leverages a domain knowledge-driven reward. The reward is derived from knowledge in the domain of image processing, approximating contours on a two-dimensional field. Because obstacles are modeled as maxima on the field, collisions are inherently avoided, as contours never pass through peaks or intersect. Additionally, contours are smooth and energy-efficient. Our framework enables training with large swarm sizes because agent interaction is minimized and the need for the complex credit assignment schemes or observation sharing mechanisms of state-of-the-art MARL approaches is eliminated. Moreover, through intensive training, UAVs acquire the ability to adapt to complex environments where contours may be non-viable or non-existent. Extensive experiments are conducted to evaluate the performance of our framework against state-of-the-art MARL algorithms.
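As a rough illustration of the contour idea in this abstract, the sketch below builds a two-dimensional field with obstacles as Gaussian maxima and scores a step by how far it drifts off its current contour. The Gaussian field shape, the `OBSTACLES` list, and the `contour_reward` formula are all assumptions made for illustration; the abstract does not specify the actual reward.

```python
import math

# Hypothetical obstacles: (x, y, peak height, width) of Gaussian bumps.
OBSTACLES = [(0.0, 0.0, 1.0, 1.0)]


def field(x, y, obstacles=OBSTACLES):
    """Scalar field whose local maxima sit on the obstacles (assumed Gaussian)."""
    return sum(
        h * math.exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2 * w ** 2))
        for ox, oy, h, w in obstacles
    )


def contour_reward(pos, next_pos, obstacles=OBSTACLES):
    """Illustrative reward: 0 if the step stays on one contour, negative otherwise.

    A step that keeps the field value constant moves along a level set,
    so it can never climb to a peak (an obstacle).
    """
    return -abs(field(*next_pos, obstacles) - field(*pos, obstacles))


# Moving around the obstacle at constant distance stays on one contour.
on_contour = contour_reward((1.0, 0.0), (0.0, 1.0))
# Moving straight toward the obstacle leaves the contour and is penalized.
toward_peak = contour_reward((1.0, 0.0), (0.5, 0.0))
```

Under these assumptions, `on_contour` is zero while `toward_peak` is negative, which is the sense in which a contour-following reward discourages approaching obstacles without any explicit interaction between agents.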
RO May 8, 2021
$E^2Coop$: Energy Efficient and Cooperative Obstacle Detection and Avoidance for UAV Swarms
Shuangyao Huang, Haibo Zhang, Zhiyi Huang
Energy efficiency is of critical importance to trajectory planning for UAV swarms in obstacle avoidance. In this paper, we present $E^2Coop$, a new scheme designed to avoid collisions for UAV swarms by tightly coupling the Artificial Potential Field (APF) method with Particle Swarm Optimization (PSO)-based trajectory planning. In $E^2Coop$, swarm members perform trajectory planning cooperatively to avoid collisions in an energy-efficient manner. $E^2Coop$ exploits the advantages of the active contour model in image processing for trajectory planning. Each swarm member plans its trajectories on the contours of the environment field to save energy and avoid collisions with obstacles. Swarm members that fall within the safeguard distance of each other plan their trajectories on different contours to avoid collisions with each other. Simulation results demonstrate that $E^2Coop$ can save up to 51\% energy compared with two state-of-the-art schemes.
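To make "planning on the contours of the environment field" concrete, the sketch below traces a level set of a potential field by stepping perpendicular to the gradient and then correcting back onto the starting level. It is a plain level-set trace, not the active contour model or the PSO planner named in the abstract; the Gaussian field, the obstacle tuple format, and the `trace_contour` helper are hypothetical.

```python
import math


def field(x, y, obstacles):
    """Assumed potential field: one Gaussian peak per obstacle (x, y, height, width)."""
    return sum(
        h * math.exp(-((x - ox) ** 2 + (y - oy) ** 2) / (2 * w ** 2))
        for ox, oy, h, w in obstacles
    )


def gradient(x, y, obstacles, eps=1e-6):
    """Central-difference estimate of the field gradient."""
    gx = (field(x + eps, y, obstacles) - field(x - eps, y, obstacles)) / (2 * eps)
    gy = (field(x, y + eps, obstacles) - field(x, y - eps, obstacles)) / (2 * eps)
    return gx, gy


def trace_contour(start, obstacles, step=0.05, n_steps=200):
    """Follow the contour through `start`, returning the visited points."""
    x, y = start
    level = field(x, y, obstacles)
    path = [(x, y)]
    for _ in range(n_steps):
        gx, gy = gradient(x, y, obstacles)
        norm = math.hypot(gx, gy)
        if norm < 1e-9:
            break  # flat region: no contour direction is defined here
        # Step along the tangent, i.e. perpendicular to the gradient.
        x += step * (-gy / norm)
        y += step * (gx / norm)
        # One Newton correction along the gradient to return to the level.
        gx, gy = gradient(x, y, obstacles)
        g2 = gx * gx + gy * gy
        if g2 > 1e-18:
            err = field(x, y, obstacles) - level
            x -= err * gx / g2
            y -= err * gy / g2
        path.append((x, y))
    return path


obstacles = [(0.0, 0.0, 1.0, 1.0)]
path = trace_contour((1.5, 0.0), obstacles)
```

With a single obstacle at the origin, the traced path circles it at a roughly constant distance, so the trajectory stays clear of the peak by construction. Two members starting on different levels would trace different contours, which is the intuition behind separating nearby swarm members onto different contours.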