Zhefan Xu

10 papers

349 citations

Novelty 52%

AI Score 52

Ranked #34,937 of 201,326 authors (top 17%); #886 in RO (top 12%)

10 Papers

RO | Sep 15, 2022 | Code

Vision-aided UAV navigation and dynamic obstacle avoidance using gradient-based B-spline trajectory optimization

Zhefan Xu, Yumeng Xiu, Xiaoyang Zhan et al.

Navigating dynamic environments requires the robot to generate collision-free trajectories and actively avoid moving obstacles. Most previous works designed path planning algorithms based on one single map representation, such as the geometric, occupancy, or ESDF map. Although they have shown success in static environments, due to the limitation of map representation, those methods cannot reliably handle static and dynamic obstacles simultaneously. To address the problem, this paper proposes a gradient-based B-spline trajectory optimization algorithm utilizing the robot's onboard vision. The depth vision enables the robot to track and represent dynamic objects geometrically based on the voxel map. The proposed optimization first adopts the circle-based guide-point algorithm to approximate the costs and gradients for avoiding static obstacles. Then, with the vision-detected moving objects, our receding-horizon distance field is simultaneously used to prevent dynamic collisions. Finally, the iterative re-guide strategy is applied to generate the collision-free trajectory. The simulation and physical experiments prove that our method can run in real-time to navigate dynamic environments safely. Our software is available on GitHub as an open-source package.
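To make the B-spline trajectory representation concrete, the sketch below evaluates one segment of a uniform cubic B-spline from four control points. It is the generic textbook formulation, not the authors' implementation; the function name `cubic_bspline_point` and the choice of a uniform cubic spline are assumptions made for illustration.

```python
def cubic_bspline_point(p0, p1, p2, p3, u):
    """Evaluate one uniform cubic B-spline segment at u in [0, 1].

    Each control point is a tuple of coordinates (for example x, y, z).
    This is an illustrative helper, not code from the paper's package.
    """
    # Uniform cubic B-spline basis functions; they sum to 1 for any u.
    b0 = (1 - u) ** 3 / 6.0
    b1 = (3 * u**3 - 6 * u**2 + 4) / 6.0
    b2 = (-3 * u**3 + 3 * u**2 + 3 * u + 1) / 6.0
    b3 = u**3 / 6.0
    return tuple(
        b0 * a + b1 * b + b2 * c + b3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


# Example: a 2D segment sampled at its midpoint.
midpoint = cubic_bspline_point((0, 0), (1, 2), (3, 2), (4, 0), 0.5)
```

Because every point on the segment is a convex combination of its four control points, a gradient-based optimizer can reshape the trajectory by moving control points away from obstacles.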

RO | Sep 17, 2022 | Code

A real-time dynamic obstacle tracking and mapping system for UAV navigation and collision avoidance with an RGB-D camera

Zhefan Xu, Xiaoyang Zhan, Baihan Chen et al.

Real-time dynamic environment perception has become vital for autonomous robots in crowded spaces. Although the popular voxel-based mapping methods can efficiently represent 3D obstacles with arbitrarily complex shapes, they can hardly distinguish between static and dynamic obstacles, which limits obstacle avoidance performance. While plenty of sophisticated learning-based dynamic obstacle detection algorithms exist in autonomous driving, those approaches cannot run in real time on a quadcopter's limited computation resources. To address these issues, we propose a real-time dynamic obstacle tracking and mapping system for quadcopter obstacle avoidance using an RGB-D camera. The proposed system first utilizes a depth image with an occupancy voxel map to generate potential dynamic obstacle regions as proposals. With the obstacle region proposals, the Kalman filter and our continuity filter are applied to track each dynamic obstacle. Finally, an environment-aware trajectory prediction method based on the Markov chain is proposed, using the states of the tracked dynamic obstacles. We implemented the proposed system with our custom quadcopter and navigation planner. The simulation and physical experiments show that our methods can successfully track and represent obstacles in dynamic environments in real time and safely avoid obstacles. Our software is available on GitHub as an open-source ROS package.
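As a rough illustration of the Kalman-filter tracking step, the sketch below runs one predict/update cycle of a one-dimensional constant-velocity filter on a measured obstacle position. The constant-velocity model, the function name `kalman_step`, and the noise levels `q` and `r` are assumptions for this example; the paper's actual state layout and its continuity filter are not reproduced here.

```python
def kalman_step(x, P, z, dt, q=0.01, r=0.1):
    """One predict/update cycle of a 1D constant-velocity Kalman filter.

    x = [position, velocity], P = 2x2 covariance as nested lists,
    z = measured position. q and r are hypothetical noise variances.
    """
    # Predict: x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]]
    # and Q = q * identity (a simplifying assumption).
    px = x[0] + dt * x[1]
    vx = x[1]
    p00 = P[0][0] + dt * (P[1][0] + P[0][1]) + dt * dt * P[1][1] + q
    p01 = P[0][1] + dt * P[1][1]
    p10 = P[1][0] + dt * P[1][1]
    p11 = P[1][1] + q
    # Update with H = [1, 0]: only the position is measured.
    s = p00 + r                  # innovation covariance
    k0, k1 = p00 / s, p10 / s    # Kalman gain
    y = z - px                   # innovation
    x_new = [px + k0 * y, vx + k1 * y]
    P_new = [[(1 - k0) * p00, (1 - k0) * p01],
             [p10 - k1 * p00, p11 - k1 * p01]]
    return x_new, P_new


# Example: track an obstacle moving at 1 m/s, observed every 0.1 s.
state, cov = [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]]
for k in range(1, 301):
    state, cov = kalman_step(state, cov, z=0.1 * k, dt=0.1)
```

The velocity is never measured directly; the filter infers it from how the position measurements change over time, which is what makes the tracked state usable for trajectory prediction.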

RO | Jan 20, 2023 | Code

A vision-based autonomous UAV inspection framework for unknown tunnel construction sites with dynamic obstacles

Zhefan Xu, Baihan Chen, Xiaoyang Zhan et al.

Tunnel construction using the drill-and-blast method requires the 3D measurement of the excavation front to evaluate underbreak locations. Considering the inspection and measurement task's safety, cost, and efficiency, deploying lightweight autonomous robots, such as unmanned aerial vehicles (UAVs), becomes more necessary and popular. Most of the previous works use a prior map for inspection viewpoint determination and do not consider dynamic obstacles. To maximize the level of autonomy, this paper proposes a vision-based UAV inspection framework for dynamic tunnel environments without using a prior map. Our approach utilizes a hierarchical planning scheme, decomposing the inspection problem into different levels. The high-level decision maker first determines the task for the robot and generates the target point. Then, the mid-level path planner finds the waypoint path and optimizes the collision-free static trajectory. Finally, the static trajectory is fed into the low-level local planner to avoid dynamic obstacles and navigate to the target point. In addition, our framework contains a novel dynamic map module that can simultaneously track dynamic obstacles and represent static obstacles based on an RGB-D camera. After inspection, the Structure-from-Motion (SfM) pipeline is applied to generate the 3D shape of the target. To the best of our knowledge, this is the first time autonomous inspection has been realized in unknown and dynamic tunnel environments. Our flight experiments in a real tunnel demonstrate that our method can autonomously inspect the tunnel excavation front surface. Our software is available on GitHub as an open-source ROS package.

64.6 | CV | May 19

VL-DPO: Vision-Language-Guided Finetuning for Preference-Aligned Autonomous Driving

Zhefan Xu, Ghassen Jerfel, Marina Haliem et al.

The rapid growth of autonomous driving datasets has enabled the scaling of powerful motion forecasting models. While large-scale pretraining provides strong performance, the standard imitation objective may not fully capture the complex nuances of human driving preferences. Meanwhile, recent advances in vision-language models (VLMs) have demonstrated impressive reasoning and commonsense understanding. Building on these capabilities, this paper presents VL-DPO, a vision-language-guided framework that aligns ego-vehicle motion forecasting models with human preferences. Our approach leverages a VLM as a zero-shot reasoner to automatically generate preference pairs from a pretrained model's rollouts, which are then used to finetune the model via Direct Preference Optimization (DPO). We finetune our models on the Waymo Open End-to-End Driving Dataset (WOD-E2E) and evaluate performance against held-out human preference annotations using rater feedback score (RFS) and average displacement error (ADE). Our experiments confirm that the VLM's trajectory selection is a high-quality proxy for human preference. Our final model, VL-DPO, yields an 11.94% increase in RFS and a 10.01% reduction in ADE over the pretrained model.
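For readers unfamiliar with Direct Preference Optimization, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. It is a minimal illustration of the generic objective only; the function name `dpo_loss`, the scalar inputs, and the default `beta` are assumptions, and nothing here reflects how VL-DPO scores trajectories or builds its pairs.

```python
import math


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Inputs are log-probabilities of each sample under the model being
    finetuned and under the frozen reference (pretrained) model.
    beta is a hypothetical default, not a value taken from the paper.
    """
    # Implicit reward margin: how much more the finetuned model favors the
    # chosen sample over the rejected one, relative to the reference model.
    margin = beta * (
        (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    )
    # loss = -log(sigmoid(margin)), written as a numerically stable softplus.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Example: the finetuned model already prefers the chosen sample slightly.
example = dpo_loss(-1.0, -3.0, -2.0, -2.0)
```

When the finetuned model and the reference agree exactly, the margin is zero and the loss equals log 2; the loss falls as the model shifts probability toward the preferred sample.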

33.1 | RO | May 15

NavRL++: A System-Level Framework for Improving Sim-to-Real Transfer in Reinforcement Learning-Based Robot Navigation

Zhefan Xu, Hanyu Jin, Kenji Shimada

Recent years have witnessed significant progress in autonomous navigation using reinforcement learning. However, existing approaches largely emphasize reinforcement learning framework design, such as input representations, action spaces, and reward functions, while providing limited analysis of sim-to-real transfer and insufficient insight into how training strategies affect real-world deployment performance. To bridge this gap, we not only introduce an effective RL framework but also present a complete training and deployment pipeline, along with a systematic empirical study that disentangles the key factors affecting sim-to-real transfer in reinforcement learning-based navigation, including sensor noise, perception failures, system latency, and control response. Building on insights from this analysis, we introduce perturbation-aware fine-tuning, a post-training adaptation strategy that improves transfer robustness by explicitly accounting for empirically identified domain discrepancies. To further mitigate perception degradation and enhance control smoothness in real-world deployment, we propose a Transformer-based temporal reasoning policy that leverages short-horizon observation for navigation control. We quantitatively evaluate how individual sim-to-real perturbations and training design choices impact navigation performance across environments. Experimental results demonstrate that the proposed training strategy and policy architecture outperform learning-based baselines in both static and dynamic environments, while achieving performance comparable to optimization-based planners in static settings. We validate our approach through real-world deployment on multiple robotic platforms, including aerial and legged robots, across navigation-centric tasks such as exploration and inspection, demonstrating zero-shot sim-to-real transfer.

26.8 | RO | May 11

ASIP-Planner: Adaptive Planning for UAV Surface Inspection in Partially Known Indoor Environments

Hanyu Jin, Zhefan Xu, Haoyu Shen et al.

Indoor infrastructure inspection, such as tunnels and industrial facilities, requires systematic surface coverage to ensure that all inspection targets are properly observed. Unmanned Aerial Vehicles (UAVs) offer an alternative to manual inspection by conducting map-guided surface inspection using prior structural models. However, in practice, indoor inspection often relies on floorplan-derived reference maps that may not reflect unforeseen obstacles, such as temporary structures or equipment, leading to occluded viewpoints and degraded inspection quality. Existing coverage planning methods typically assume a fully known inspection environment and perform deterministic global viewpoint optimization based on accurate prior maps, making them vulnerable to environmental discrepancies during execution. This work presents an adaptive UAV inspection framework for partially known structured indoor environments. The proposed method integrates a segment-based global coverage planner with an inspection-oriented local view-angle adaptation module. The global planner organizes planar inspection targets into surface-aligned clusters to generate compact viewpoint sequences with improved orientation consistency. The local planner generates collision-free trajectories and adjusts the viewing direction online to mitigate occlusion-induced coverage loss while preserving the planned trajectory structure. The simulation results across randomized scene configurations demonstrate that the proposed global planner achieves near-complete coverage while reducing trajectory length compared to representative baselines. Real-world flight experiments further validate that the framework produces usable inspection data for downstream analysis. These results indicate that the proposed framework improves inspection efficiency and adaptability in partially known structured indoor environments.

RO | Sep 14, 2021

DPMPC-Planner: A real-time UAV trajectory planning framework for complex static environments with dynamic obstacles

Zhefan Xu, Di Deng, Yiping Dong et al.

Safe UAV navigation is challenging due to complex environment structures, dynamic obstacles, and uncertainties from measurement noise and unpredictable moving obstacle behaviors. Although plenty of recent works achieve safe navigation in complex static environments with sophisticated mapping algorithms, such as occupancy maps and ESDF maps, these methods cannot reliably handle dynamic environments due to the mapping limitations caused by moving obstacles. To address the limitation, this paper proposes a trajectory planning framework to achieve safe navigation considering complex static environments with dynamic obstacles. To reliably handle dynamic obstacles, we divide the environment representation into static mapping and dynamic object representation, which can be obtained from computer vision methods. Our framework first generates a static trajectory based on the proposed iterative corridor shrinking algorithm. Then, reactive chance-constrained model predictive control with temporal goal tracking is applied to avoid dynamic obstacles with uncertainties. The simulation results in various environments demonstrate the ability of our algorithm to navigate safely in complex static environments with dynamic obstacles.

RO | Nov 10, 2020

Frontier-based Automatic-differentiable Information Gain Measure for Robotic Exploration of Unknown 3D Environments

Di Deng, Zhefan Xu, Wenbo Zhao et al.

The path planning problem for autonomous exploration of an unknown region by a robotic agent typically employs frontier-based or information-theoretic heuristics. Frontier-based heuristics typically evaluate the information gain of a viewpoint by the number of visible frontier voxels, which is a discrete measure that can only be optimized by sampling. On the other hand, information-theoretic heuristics compute information gain as the mutual information between the map and the sensor's measurement. Although the gradient of such measures can be computed, the computation involves costly numerical differentiation. In this work, we add a novel fuzzy logic filter in the counting of visible frontier voxels surrounding a viewpoint, which allows the gradient of the information gain with respect to the viewpoint to be efficiently computed using automatic differentiation. This enables us to simultaneously optimize information gain with other differentiable quality measures such as path length. Using multiple simulation environments, we demonstrate that the proposed gradient-based optimization method consistently improves the information gain and other quality measures of exploration paths.
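The idea of replacing a discrete frontier count with a differentiable one can be sketched as follows. Each frontier voxel contributes a smooth weight between 0 and 1 instead of a hard visible/not-visible flag, so the total has a well-defined gradient with respect to the viewpoint. The sigmoid membership function, the range-only visibility model, and the names `soft_frontier_gain`, `max_range`, and `sharpness` are all assumptions for illustration; the paper's fuzzy logic filter is not specified here, and the paper uses automatic differentiation, whereas this dependency-free sketch writes the gradient by hand.

```python
import math


def soft_frontier_gain(view, frontiers, max_range=5.0, sharpness=4.0):
    """Smooth count of frontier voxels near a viewpoint, plus its gradient.

    Each voxel contributes sigmoid(sharpness * (max_range - distance))
    rather than a hard 0/1 flag. Occlusion and field of view are ignored.
    """
    gain = 0.0
    grad = [0.0] * len(view)
    for f in frontiers:
        diff = [v - c for v, c in zip(view, f)]
        dist = math.sqrt(sum(d * d for d in diff))
        # Clamp the exponent so math.exp cannot overflow for far voxels.
        arg = max(-60.0, min(60.0, sharpness * (max_range - dist)))
        w = 1.0 / (1.0 + math.exp(-arg))
        gain += w
        if dist > 1e-9:
            # Chain rule: dw/dview = -sharpness * w * (1 - w) * diff / dist
            scale = -sharpness * w * (1.0 - w) / dist
            grad = [g + scale * d for g, d in zip(grad, diff)]
    return gain, grad


# Example: three frontier voxels around a candidate viewpoint.
voxels = [(4.0, 2.0, 0.5), (6.0, 5.0, 1.0), (0.0, -3.0, 2.0)]
gain, grad = soft_frontier_gain((1.0, 2.0, 0.5), voxels)
```

With a gradient available, the gain can be combined with other differentiable terms such as path length and optimized by gradient ascent rather than by sampling candidate viewpoints.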

RO | Nov 10, 2020

Coordinated Aerial-Ground Robot Exploration via Monte-Carlo View Quality Rendering

Di Deng, Zhefan Xu, Wenbo Zhao et al.

We present a framework for a ground-aerial robotic team to explore large, unstructured, and unknown environments. In such exploration problems, the effectiveness of existing exploration-boosting heuristics often scales poorly with the environments' size and complexity. This work proposes a novel framework combining incremental frontier distribution, goal selection with Monte-Carlo view quality rendering, and an automatic-differentiable information gain measure to improve exploration efficiency. In simulations with multiple complex environments, we demonstrate that the proposed method effectively utilizes collaborative aerial and ground robots, consistently guides agents to informative viewpoints, improves exploration paths' information gain, and reduces planning time.

RO | Oct 14, 2020

Autonomous UAV Exploration of Dynamic Environments via Incremental Sampling and Probabilistic Roadmap

Zhefan Xu, Di Deng, Kenji Shimada

Autonomous exploration requires robots to generate informative trajectories iteratively. Although sampling-based methods are highly efficient in unmanned aerial vehicle exploration, many of these methods do not effectively utilize the sampled information from the previous planning iterations, leading to redundant computation and longer exploration time. Also, few have explicitly shown their exploration ability in dynamic environments even though they can run in real time. To overcome these limitations, we propose a novel dynamic exploration planner (DEP) for exploring unknown environments using incremental sampling and Probabilistic Roadmap (PRM). In our sampling strategy, nodes are added incrementally and distributed evenly in the explored region, yielding the best viewpoints. To further shorten exploration time and ensure safety, our planner optimizes paths locally and refines them based on the Euclidean Signed Distance Function (ESDF) map. Meanwhile, as a multi-query planner, PRM allows the proposed planner to quickly search for alternative paths to avoid dynamic obstacles for safe exploration. Simulation experiments show that our method safely explores dynamic environments and outperforms the benchmark planners in terms of exploration time, path length, and computational time.
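The multi-query property mentioned above can be illustrated with a small roadmap that grows one node at a time and is searched repeatedly. This is a generic sketch, not the DEP planner: the class name `Roadmap`, the fixed connection radius, the `is_free` collision callback, and the `blocked` set (standing in for nodes invalidated by a moving obstacle) are all hypothetical.

```python
import heapq
import math


class Roadmap:
    """A tiny multi-query roadmap: nodes are added incrementally and reused."""

    def __init__(self, connect_radius):
        self.connect_radius = connect_radius
        self.nodes = []   # list of coordinate tuples
        self.edges = {}   # node index -> {neighbor index: edge length}

    def add_node(self, point, is_free=lambda a, b: True):
        """Insert a node and connect it to existing nodes within the radius."""
        point = tuple(point)
        idx = len(self.nodes)
        self.nodes.append(point)
        self.edges[idx] = {}
        for j in range(idx):
            d = math.dist(point, self.nodes[j])
            if d <= self.connect_radius and is_free(point, self.nodes[j]):
                self.edges[idx][j] = d
                self.edges[j][idx] = d
        return idx

    def shortest_path(self, start, goal, blocked=frozenset()):
        """Dijkstra search that skips nodes currently marked as blocked."""
        dist = {start: 0.0}
        prev = {}
        queue = [(0.0, start)]
        while queue:
            d, u = heapq.heappop(queue)
            if u == goal:
                path = [u]
                while path[-1] != start:
                    path.append(prev[path[-1]])
                return path[::-1]
            if d > dist.get(u, math.inf):
                continue  # stale queue entry
            for v, w in self.edges[u].items():
                if v in blocked:
                    continue
                nd = d + w
                if nd < dist.get(v, math.inf):
                    dist[v] = nd
                    prev[v] = u
                    heapq.heappush(queue, (nd, v))
        return None  # goal unreachable with the current blocked set


# Example: four nodes on a unit square; node 1 becomes blocked.
rm = Roadmap(connect_radius=1.1)
for p in [(0, 0), (1, 0), (1, 1), (0, 1)]:
    rm.add_node(p)
detour = rm.shortest_path(0, 2, blocked={1})
```

Because the roadmap persists between queries, rerouting around a newly blocked node only requires another graph search, not resampling the environment.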