Haoyao Chen

h-index5

9papers

49citations

Novelty51%

AI Score52

Ranked #32,763 of 205,806 authors (top 16%)#805 in RO (top 11%)

9 Papers

ROMar 29, 2023Code

Active Implicit Object Reconstruction using Uncertainty-guided Next-Best-View Optimization

Dongyu Yan, Jianheng Liu, Fengyu Quan et al.

Actively planning sensor views during object reconstruction is crucial for autonomous mobile robots. An effective method should be able to strike a balance between accuracy and efficiency. In this paper, we propose a seamless integration of the emerging implicit representation with the active reconstruction task. We build an implicit occupancy field as our geometry proxy. While training, the prior object bounding box is utilized as auxiliary information to generate clean and detailed reconstructions. To evaluate view uncertainty, we employ a sampling-based approach that directly extracts entropy from the reconstructed occupancy probability field as our measure of view information gain. This eliminates the need for additional uncertainty maps or learning. Unlike previous methods that compare view uncertainty within a finite set of candidates, we aim to find the next-best-view (NBV) on a continuous manifold. Leveraging the differentiability of the implicit representation, the NBV can be optimized directly by maximizing the view uncertainty using gradient descent. It significantly enhances the method's adaptability to different scenarios. Simulation and real-world experiments demonstrate that our approach effectively improves reconstruction accuracy and efficiency of view planning in active reconstruction tasks. The proposed system will open source at https://github.com/HITSZ-NRSL/ActiveImplicitRecon.git.

28.0ROApr 30Code

Adaptive Nonlinear MPC for Trajectory Tracking of An Overactuated Tiltrotor Hexacopter

Yueqian Liu, Fengyu Quan, Haoyao Chen

Omnidirectional micro aerial vehicles (OMAVs) are more capable of doing environmentally interactive tasks due to their ability to exert full wrenches while maintaining stable poses. However, OMAVs often incorporate additional actuators and complex mechanical structures to achieve omnidirectionality. Obtaining precise mathematical models is difficult, and the mismatch between the model and the real physical system is not trivial. The large model-plant mismatch significantly degrades overall system performance if a non-adaptive model predictive controller (MPC) is used. This work presents the $\mathcal{L}_1$-MPC, an adaptive nonlinear model predictive controller for accurate 6-DOF trajectory tracking of an overactuated tiltrotor hexacopter in the presence of model uncertainties and external disturbances. The $\mathcal{L}_1$-MPC adopts a cascaded system architecture in which a nominal MPC is followed and augmented by an $\mathcal{L}_1$ adaptive controller. The proposed method is evaluated against the non-adaptive MPC, the EKF-MPC, and the PID method in both numerical and PX4 software-in-the-loop simulation with Gazebo. The $\mathcal{L}_1$-MPC reduces the tracking error by around 90% when compared to a non-adaptive MPC, and the $\mathcal{L}_1$-MPC has lower tracking errors, higher uncertainty estimation rates, and less tuning requirements over the EKF-MPC. We will make the implementations, including the hardware-verified PX4 firmware and Gazebo plugins, open-source at https://github.com/HITSZ-NRSL/omniHex.

CVJul 19, 2022Code

Det6D: A Ground-Aware Full-Pose 3D Object Detector for Improving Terrain Robustness

Junyuan Ouyang, Haoyao Chen

Accurate 3D object detection with LiDAR is critical for autonomous driving. Existing research is all based on the flat-world assumption. However, the actual road can be complex with steep sections, which breaks the premise. Current methods suffer from performance degradation in this case due to difficulty correctly detecting objects on sloped terrain. In this work, we propose Det6D, the first full-degree-of-freedom 3D object detector without spatial and postural limitations, to improve terrain robustness. We choose the point-based framework by founding their capability of detecting objects in the entire spatial range. To predict full-degree poses, including pitch and roll, we design a ground-aware orientation branch that leverages the local ground constraints. Given the difficulty of long-tail non-flat scene data collection and 6D pose annotation, we present Slope-Aug, a data augmentation method for synthesizing non-flat terrain from existing datasets recorded in flat scenes. Experiments on various datasets demonstrate the effectiveness and robustness of our method in different terrains. We further conducted an extended experiment to explore how the network predicts the two extra poses. The proposed modules are plug-and-play for existing point-based frameworks. The code is available at https://github.com/HITSZ-NRSL/De6D.

CVMay 23, 2023Code

Hierarchical Adaptive Voxel-guided Sampling for Real-time Applications in Large-scale Point Clouds

Junyuan Ouyang, Xiao Liu, Haoyao Chen

While point-based neural architectures have demonstrated their efficacy, the time-consuming sampler currently prevents them from performing real-time reasoning on scene-level point clouds. Existing methods attempt to overcome this issue by using random sampling strategy instead of the commonly-adopted farthest point sampling~(FPS), but at the expense of lower performance. So the effectiveness/efficiency trade-off remains under-explored. In this paper, we reveal the key to high-quality sampling is ensuring an even spacing between points in the subset, which can be naturally obtained through a grid. Based on this insight, we propose a hierarchical adaptive voxel-guided point sampler with linear complexity and high parallelization for real-time applications. Extensive experiments on large-scale point cloud detection and segmentation tasks demonstrate that our method achieves competitive performance with the most powerful FPS, at an amazing speed that is more than 100 times faster. This breakthrough in efficiency addresses the bottleneck of the sampling step when handling scene-level point clouds. Furthermore, our sampler can be easily integrated into existing models and achieves a 20$\sim$80\% reduction in runtime with minimal effort. The code will be available at https://github.com/OuyangJunyuan/pointcloud-3d-detector-tensorrt

ROMar 24, 2021Code

Pole-like Objects Mapping and Long-Term Robot Localization in Dynamic Urban Scenarios

Zhihao Wang, Silin Li, Ming Cao et al.

Localization on 3D data is a challenging task for unmanned vehicles, especially in long-term dynamic urban scenarios. Due to the generality and long-term stability, the pole-like objects are very suitable as landmarks for unmanned vehicle localization in time-varing scenarios. In this paper, a long-term LiDAR-only localization algorithm based on semantic cluster map is proposed. At first, the Convolutional Neural Network(CNN) is used to infer the semantics of LiDAR point clouds. Combined with the point cloud segmentation, the long-term static objects pole/trunk in the scene are extracted and registered into a semantic cluster map. When the unmanned vehicle re-enters the environment again, the relocalization is completed by matching the clusters of the local map with the clusters of the global map. Furthermore, the continuous matching between the local and global maps stably outputs the global pose at 2Hz to correct the drift of the 3D LiDAR odometry. The proposed approach realizes localization in the long-term scenarios without maintaining the high-precision point cloud map. The experimental results on our campus dataset demonstrate that the proposed approach performs better in localization accuracy compared with the current state-of-the-art methods. The source of this paper is available at: http://www.github.com/HITSZ-NRSL/long-term-localization.

40.0ROApr 26

Decentralized Heterogeneous Multi-Robot Collaborative Exploration for Indoor and Outdoor 3D Environments

Yuxiang Li, Kun Chen, Jiancheng Wang et al.

Heterogeneous multi-robot systems feature significant adaptability for complex environments. However, effective collaboration that fully exploits the robots' potential remains a core challenge. This paper proposes a decentralized collaborative framework for heterogeneous multi-robot systems to autonomously explore indoor and outdoor 3D environments. First, a basic perception map that integrates terrain and observation metrics is designed. Improved supervoxel segmentation is developed to simplify the map structure and form a high-level representation that supports lightweight communication. Second, the traversal and observation capabilities of heterogeneous robots are modeled to evaluate the requirements of task views derived from incomplete supervoxels. These task views are grouped by requirements and clustered to streamline assignment. Subsequently, the view-cluster assignment is formulated as a heterogeneous multi-depot multi-traveling salesman problem (HMDMTSP) that incorporates constraints between view-cluster requirements and robot capabilities. An improved genetic algorithm is developed to efficiently solve this problem while ensuring global consistency. Based on the assignments, redundant views within clusters are eliminated to refine exploration routes. Finally, conflicts between robots' motion paths are resolved. Simulations and field experiments in cluttered indoor and outdoor environments demonstrate that our approach effectively coordinates exploration tasks among heterogeneous robots, achieving superior exploration efficiency and communication savings compared to state-of-the-art approaches.

36.5ROApr 9

EMMa: End-Effector Stability-Oriented Mobile Manipulation for Tracked Rescue Robots

Yifei Wang, Hao Zhang, Jidong Huang et al.

The autonomous operation of tracked mobile manipulators in rescue missions requires not only ensuring the reachability and safety of robot motion but also maintaining stable end-effector manipulation under diverse task demands. However, existing studies have overlooked many end-effector motion properties at both the planning and control levels. This paper presents a motion generation framework for tracked mobile manipulators to achieve stable end-effector operation in complex rescue scenarios. The framework formulates a coordinated path optimization model that couples end-effector and mobile base states and designs compact cost/constraint representations to mitigate nonlinearities and reduce computational complexity. Furthermore, an isolated control scheme with feedforward compensation and feedback regulation is developed to enable coordinated path tracking for the robot. Extensive simulated and real-world experiments on rescue scenarios demonstrate that the proposed framework consistently outperforms SOTA methods across key metrics, including task success rate and end-effector motion stability, validating its effectiveness and robustness in complex mobile manipulation tasks.

ROMar 16, 2024

MSI-NeRF: Linking Omni-Depth with View Synthesis through Multi-Sphere Image aided Generalizable Neural Radiance Field

Dongyu Yan, Guanyu Huang, Fengyu Quan et al.

Panoramic observation using fisheye cameras is significant in virtual reality (VR) and robot perception. However, panoramic images synthesized by traditional methods lack depth information and can only provide three degrees-of-freedom (3DoF) rotation rendering in VR applications. To fully preserve and exploit the parallax information within the original fisheye cameras, we introduce MSI-NeRF, which combines deep learning omnidirectional depth estimation and novel view synthesis. We construct a multi-sphere image as a cost volume through feature extraction and warping of the input images. We further build an implicit radiance field using spatial points and interpolated 3D feature vectors as input, which can simultaneously realize omnidirectional depth estimation and 6DoF view synthesis. Leveraging the knowledge from depth estimation task, our method can learn scene appearance by source view supervision only. It does not require novel target views and can be trained conveniently on existing panorama depth estimation datasets. Our network has the generalization ability to reconstruct unknown scenes efficiently using only four images. Experimental results show that our method outperforms existing methods in both depth estimation and novel view synthesis tasks.

ROMar 19, 2021

Simulation Platform for Autonomous Aerial Manipulation in Dynamic Environments

Fengyu Quan, Huisheng Huang, Hongjie Zeng et al.

The aerial manipulator (AM) is a systematic operational robotic platform in high standard on algorithm robustness. Directly deploying the algorithms to the practical system will take numerous trial and error costs and even cause destructive results. In this paper, a new modular simulation platform is designed to evaluate aerial manipulation related algorithms before deploying. In addition, to realize a fully autonomous aerial grasping, a series of algorithm modules consisting a complete workflow are designed and integrated in the simulation platform, including perception, planning and control modules. This framework empowers the AM to autonomously grasp remote targets without colliding with surrounding obstacles relying only on on-board sensors. Benefiting from its modular design, this software architecture can be easily extended with additional algorithms. Finally, several simulations are performed to verify the effectiveness of the proposed system.