RO · Mar 16

CycleRL: Sim-to-Real Deep Reinforcement Learning for Robust Autonomous Bicycle Control

arXiv:2603.15013 · 26.5 · h-index: 1
AI Analysis

This paper addresses robust control of autonomous bicycles for urban mobility and offers better adaptability than traditional methods, though the contribution is incremental: it applies existing DRL techniques to a new domain.

The paper tackles autonomous bicycle control with CycleRL, a sim-to-real deep reinforcement learning framework that achieves a 99.90% balance success rate and low tracking errors in simulation, with successful transfer to hardware.

Autonomous bicycles offer a promising agile solution for urban mobility and last-mile logistics; however, conventional control strategies often struggle with their underactuated nonlinear dynamics, suffering from sensitivity to model mismatches and limited adaptability to real-world uncertainties. To address this, this paper presents CycleRL, the first sim-to-real deep reinforcement learning framework designed for robust autonomous bicycle control. Our approach trains an end-to-end neural control policy within the high-fidelity NVIDIA Isaac Sim environment, leveraging Proximal Policy Optimization (PPO) to circumvent the need for an explicit dynamics model. The framework features a composite reward function tailored for concurrent balance maintenance, velocity tracking, and steering control. Crucially, systematic domain randomization is employed to bridge the simulation-to-reality gap and facilitate direct transfer. In simulation, CycleRL achieves strong performance, including a 99.90% balance success rate, a low steering tracking error of 1.15°, and a velocity tracking error of 0.18 m/s. These quantitative results, coupled with successful hardware transfer, validate DRL as an effective paradigm for autonomous bicycle control, offering superior adaptability over traditional methods. Video demonstrations are available at https://anony6f05.github.io/CycleRL/.
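
The abstract names a composite reward (balance, velocity tracking, steering) and systematic domain randomization but gives no formulas or ranges. The following is a minimal Python sketch of what such components might look like, not the paper's implementation. Every name, weight, error scale, and randomization range below (RewardWeights, composite_reward, sample_randomized_params, the Gaussian-shaped terms) is an assumption made for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not state the actual values.
    balance: float = 1.0
    velocity: float = 0.5
    steering: float = 0.5


def composite_reward(roll_rad, v, v_target, steer_rad, steer_target_rad,
                     w=RewardWeights()):
    """Weighted sum of three terms, each in (0, 1] and equal to 1 at zero error.

    The Gaussian shaping and the error scales (0.2 rad, 0.5 m/s, 0.1 rad)
    are assumptions chosen for illustration.
    """
    r_balance = math.exp(-(roll_rad / 0.2) ** 2)
    r_velocity = math.exp(-((v - v_target) / 0.5) ** 2)
    r_steering = math.exp(-((steer_rad - steer_target_rad) / 0.1) ** 2)
    return (w.balance * r_balance
            + w.velocity * r_velocity
            + w.steering * r_steering)


def sample_randomized_params(rng):
    """Draw one set of physical parameters, e.g. once per training episode.

    Parameter names and ranges are hypothetical placeholders.
    """
    return {
        "mass_scale": rng.uniform(0.8, 1.2),
        "friction": rng.uniform(0.5, 1.0),
        "actuator_delay_s": rng.uniform(0.0, 0.03),
    }
```

With zero error on all three terms the reward equals the sum of the weights (2.0 with the defaults above), and any roll, velocity, or steering error lowers it. In a PPO training loop, a sampler like sample_randomized_params would typically be called at each environment reset so the policy never trains against a single fixed set of dynamics.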

