Yuman Gao

h-index: 11

4 papers

175 citations

Novelty: 50%

AI Score: 37

Ranked #95,325 of 194,257 authors (top 49%) · #2,859 in RO (top 42%)

4 Papers

14.5 · RO · May 20, 2025

Toward Real-World Cooperative and Competitive Soccer with Quadrupedal Robot Teams

Zhi Su, Yuman Gao, Emily Lukas et al. · ByteDance

Achieving coordinated teamwork among legged robots requires both fine-grained locomotion control and long-horizon strategic decision-making. Robot soccer offers a compelling testbed for this challenge, combining dynamic, competitive, and multi-agent interactions. In this work, we present a hierarchical multi-agent reinforcement learning (MARL) framework that enables fully autonomous and decentralized quadruped robot soccer. First, a set of highly dynamic low-level skills is trained for legged locomotion and ball manipulation, such as walking, dribbling, and kicking. On top of these, a high-level strategic planning policy is trained with Multi-Agent Proximal Policy Optimization (MAPPO) via Fictitious Self-Play (FSP). This learning framework allows agents to adapt to diverse opponent strategies and gives rise to sophisticated team behaviors, including coordinated passing, interception, and dynamic role allocation. With an extensive ablation study, the proposed learning method shows significant advantages in the cooperative and competitive multi-agent soccer game. We deploy the learned policies to real quadruped robots relying solely on onboard proprioception and decentralized localization, with the resulting system supporting autonomous robot-robot and robot-human soccer matches on indoor and outdoor soccer courts.
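The self-play idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common fictitious self-play recipe in which the learner trains against opponents drawn uniformly from a pool of frozen past snapshots. The abstract does not give the paper's sampling scheme, and `OpponentPool`, `train_with_fsp`, and `update_fn` are hypothetical names, with `update_fn` standing in for one MAPPO update.

```python
import random


class OpponentPool:
    """Hypothetical pool of frozen past policies (not the paper's code)."""

    def __init__(self, seed=0):
        self._snapshots = []
        self._rng = random.Random(seed)

    def add(self, snapshot):
        self._snapshots.append(snapshot)

    def sample(self):
        # Assumption: uniform sampling over all stored snapshots.
        if not self._snapshots:
            raise ValueError("opponent pool is empty")
        return self._rng.choice(self._snapshots)

    def __len__(self):
        return len(self._snapshots)


def train_with_fsp(initial_policy, update_fn, num_iterations, snapshot_every, seed=0):
    """Train against sampled past selves, snapshotting the learner periodically."""
    pool = OpponentPool(seed)
    policy = initial_policy
    pool.add(policy)  # the first opponent is the untrained policy itself
    for iteration in range(1, num_iterations + 1):
        opponent = pool.sample()
        policy = update_fn(policy, opponent)  # placeholder for one learning step
        if iteration % snapshot_every == 0:
            pool.add(policy)
    return policy, pool
```

Playing against a growing mixture of past selves, rather than only the latest policy, is what lets the learner adapt to diverse opponent strategies.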

13.8 · RO · Sep 16, 2021

Meeting-Merging-Mission: A Multi-robot Coordinate Framework for Large-Scale Communication-Limited Exploration

Yuman Gao, Yingjian Wang, Xingguang Zhong et al.

This letter presents a complete framework, Meeting-Merging-Mission, for multi-robot exploration under communication restrictions. Considering that communication is limited in both bandwidth and range in the real world, we propose a lightweight environment representation method and an efficient cooperative exploration strategy. To reduce bandwidth use, each robot utilizes specific polytopes to maintain free space and super frontier information (SFI) as the source for exploration decision-making. To reduce repeated exploration, we develop a mission-based protocol that drives robots to share collected information at stable rendezvous. We also design a complete path planning scheme for both centralized and decentralized cases. To validate that our framework is practical and generic, we present an extensive benchmark and deploy our system on multi-UGV and multi-UAV platforms.
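For readers unfamiliar with frontier-based exploration, the sketch below shows the classic grid version of the idea: a frontier is a free cell bordering unknown space. It is not the paper's polytope-based representation or its super frontier information. The cell encoding and the name `find_frontiers` are assumptions made for illustration.

```python
FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # assumed cell encoding


def find_frontiers(grid):
    """Return (row, col) of free cells with at least one 4-connected unknown neighbour."""
    rows, cols = len(grid), len(grid[0])
    frontiers = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != FREE:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == UNKNOWN:
                    frontiers.append((r, c))
                    break  # one unknown neighbour is enough
    return frontiers
```

Sharing a compact list of frontier cells instead of a full map is one simple way to keep exchanged data small when bandwidth is limited.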

15.6 · RO · Mar 11, 2021 · Code

Visibility-aware Trajectory Optimization with Application to Aerial Tracking

Qianhao Wang, Yuman Gao, Jialin Ji et al.

The visibility of targets determines the performance and even the success rate of various applications, such as active SLAM, exploration, and target tracking. Therefore, it is crucial to take the visibility of targets into explicit account in trajectory planning. In this paper, we propose a general metric for target visibility, considering observation distance and angle as well as occlusion effect. We formulate this metric into a differentiable visibility cost function, with which spatial trajectory and yaw can be jointly optimized. Furthermore, this visibility-aware trajectory optimization handles dynamic feasibility of position and yaw simultaneously. To validate that our method is practical and generic, we integrate it into a customized quadrotor tracking system. The experimental results show that our visibility-aware planner performs more robustly and observes targets better. To benefit related research, we release our code to the public.
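A rough sketch of how distance, angle, and occlusion can be combined into one cost is given below. It is not the paper's metric: the quadratic distance term, the cosine angle term, the spherical obstacles, the weights, and the name `visibility_cost` are all assumptions, and the cost is only piecewise smooth.

```python
import math


def visibility_cost(p, yaw, target, obstacles=(), d_des=2.0,
                    w_dist=1.0, w_angle=1.0, w_occ=1.0):
    """Illustrative visibility cost for an observer at p (x, y, z) with heading yaw.

    obstacles is an iterable of (center, radius) spheres (an assumed obstacle model).
    """
    rel = [t - q for t, q in zip(target, p)]
    dist = math.sqrt(sum(r * r for r in rel))

    # Distance term: penalise deviation from a desired observation distance.
    c_dist = (dist - d_des) ** 2

    # Angle term: 1 - cos(angle between heading and horizontal bearing to target).
    horiz = math.hypot(rel[0], rel[1]) + 1e-9
    c_angle = 1.0 - (math.cos(yaw) * rel[0] + math.sin(yaw) * rel[1]) / horiz

    # Occlusion term: penalise spheres that intrude into the line-of-sight segment.
    c_occ = 0.0
    for center, radius in obstacles:
        to_center = [c - q for c, q in zip(center, p)]
        t = sum(a * b for a, b in zip(to_center, rel)) / (dist * dist + 1e-9)
        t = min(1.0, max(0.0, t))  # clamp to the observer-target segment
        closest = [q + t * r for q, r in zip(p, rel)]
        gap = math.dist(center, closest) - radius
        c_occ += max(0.0, -gap) ** 2

    return w_dist * c_dist + w_angle * c_angle + w_occ * c_occ
```

Because the cost depends on both position and yaw, a gradient-based optimizer could adjust the two together, which is the joint optimization the abstract describes.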

17.3 · RO · Nov 8, 2020 · Code

Learning-based 3D Occupancy Prediction for Autonomous Navigation in Occluded Environments

Lizi Wang, Hongkai Ye, Qianhao Wang et al.

In autonomous navigation of mobile robots, sensors suffer from massive occlusion in cluttered environments, leaving a significant amount of space unknown during planning. In practice, treating the unknown space either optimistically or pessimistically limits planning performance, so aggressiveness and safety cannot be satisfied at the same time. However, humans can infer the exact shape of obstacles from only partial observation and generate non-conservative trajectories that avoid possible collisions in occluded space. Mimicking human behavior, in this paper, we propose a method based on a deep neural network to reliably predict the occupancy distribution of unknown space. Specifically, the proposed method utilizes contextual information of environments and learns from prior knowledge to predict obstacle distributions in occluded space. We use unlabeled and no-ground-truth data to train our network and successfully apply it to real-time navigation in unseen environments without any refinement. Results show that our method improves the performance of a kinodynamic planner by improving safety with no reduction of speed in cluttered environments.
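The trade-off between optimistic and pessimistic handling of unknown space can be made concrete with a small sketch. The cell encoding, the `mode` strings, the `threshold`, and the name `is_traversable` are assumptions. `predicted_prob` stands in for the occupancy probability a learned predictor would output, and no network is implemented here.

```python
FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # assumed cell encoding


def is_traversable(cell, mode, predicted_prob=None, threshold=0.5):
    """Decide whether a planner may route through a cell.

    mode: "optimistic" treats unknown cells as free, "pessimistic" treats them
    as occupied, "predicted" uses a predicted occupancy probability instead.
    """
    if cell == FREE:
        return True
    if cell == OCCUPIED:
        return False
    if mode == "optimistic":
        return True
    if mode == "pessimistic":
        return False
    if mode == "predicted":
        if predicted_prob is None:
            raise ValueError("predicted mode needs an occupancy probability")
        return predicted_prob < threshold
    raise ValueError(f"unknown mode: {mode}")
```

The optimistic rule favors speed at the risk of collisions and the pessimistic rule favors safety at the cost of detours, while a prediction lets the decision vary per cell.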