RO | Sep 17, 2024 | Code
ULOC: Learning to Localize in Complex Large-Scale Environments with Ultra-Wideband Ranges
Thien-Minh Nguyen, Yizhuo Yang, Tien-Dat Nguyen et al.
While UWB-based methods can achieve high localization accuracy in small-scale areas, their accuracy and reliability are significantly challenged in large-scale environments. In this paper, we propose a learning-based framework named ULOC for Ultra-Wideband (UWB) based localization in such complex large-scale environments. First, anchors are deployed in the environment without knowledge of their actual position. Then, UWB observations are collected when the vehicle travels in the environment. At the same time, map-consistent pose estimates are developed from registering (onboard self-localization) data with the prior map to provide the training labels. We then propose a network based on MAMBA that learns the ranging patterns of UWBs over a complex large-scale environment. The experiment demonstrates that our solution can ensure high localization accuracy on a large scale compared to the state-of-the-art. We release our source code to benefit the community at https://github.com/brytsknguyen/uloc.
SY | Jul 5, 2018
An integrated localization-navigation scheme for distance-based docking of UAVs
Thien-Minh Nguyen, Zhirong Qiu, Muqing Cao et al.
In this paper we study the distance-based docking problem of unmanned aerial vehicles (UAVs) by using a single landmark placed at an arbitrarily unknown position. To solve the problem, we propose an integrated estimation-control scheme to simultaneously achieve the relative localization and navigation tasks for discrete-time integrators under bounded velocity: a nonlinear adaptive estimation scheme to estimate the relative position to the landmark, and a delicate control scheme to ensure both the convergence of the estimation and the asymptotic docking at the given landmark. A rigorous proof of convergence is provided by invoking the discrete-time LaSalle's invariance principle, and we also validate our theoretical findings on quadcopters equipped with ultra-wideband ranging sensors and optical flow sensors in a GPS-less environment.
SY | May 31, 2018
Least-square based recursive optimization for distance-based source localization
Thien-Minh Nguyen, Lihua Xie
In this paper we study the problem of driving an agent to an unknown source whose location is estimated in real time by a recursive optimization algorithm. The optimization criterion is a least-squares cost function constructed from the distance measurements to the target combined with the agent's self-odometry. In this work, two important issues concerning real-world application are directly addressed: a discrete-time recursive algorithm for concurrent control and estimation, and the handling of input saturation. It is proven that with proper choices of the system's parameters, stability of all system states, including the on-board estimator variables and the agent-target relative position, can be achieved. The convergence of the agent's position to the target is also investigated via numerical simulation.
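As a rough illustration of the least-squares idea in the abstract above (not the authors' recursive algorithm, and without control or input saturation), the sketch below estimates a 2D source position in batch from ranges measured at odometry-derived agent positions. The function name `estimate_source`, the 2D setting, and the noise-free ranges are assumptions made for this example. Differencing squared ranges between consecutive samples cancels the quadratic term in the unknown source, leaving a linear system.

```python
import math


def estimate_source(positions, distances):
    """Batch least-squares estimate of a 2D source position (illustrative only).

    For consecutive samples k-1 and k, with source s:
        d_k^2 - d_{k-1}^2 = |p_k|^2 - |p_{k-1}|^2 - 2 (p_k - p_{k-1}) . s
    which is linear in s. The 2x2 normal equations are solved in closed form.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for k in range(1, len(positions)):
        x0, y0 = positions[k - 1]
        x1, y1 = positions[k]
        d0, d1 = distances[k - 1], distances[k]
        # One row of the linear system A s = b.
        ax = -2.0 * (x1 - x0)
        ay = -2.0 * (y1 - y0)
        rhs = d1 * d1 - d0 * d0 - (x1 * x1 + y1 * y1) + (x0 * x0 + y0 * y0)
        # Accumulate A^T A and A^T b.
        a11 += ax * ax
        a12 += ax * ay
        a22 += ay * ay
        b1 += ax * rhs
        b2 += ay * rhs
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("agent motion is collinear; source is not observable")
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


if __name__ == "__main__":
    # Agent positions (from odometry) and ranges to a hypothetical source at (3, 2).
    path = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    ranges = [math.hypot(3.0 - x, 2.0 - y) for x, y in path]
    print(estimate_source(path, ranges))
```

The determinant check reflects a general property of range-only estimation: if the agent moves along a straight line, the source position cannot be resolved uniquely from the ranges alone.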
RO | Apr 4
Watch Your Step: Learning Semantically-Guided Locomotion in Cluttered Environment
Denan Liang, Yuan Zhu, Ruimeng Liu et al.
Although legged robots demonstrate impressive mobility on rough terrain, using them safely in cluttered environments remains a challenge. A key issue is their inability to avoid stepping on low-lying objects, such as high-cost small devices or cables on flat ground. This limitation arises from a disconnect between high-level semantic understanding and low-level control, combined with errors in elevation maps during real-world operation. To address this, we introduce SemLoco, a Reinforcement Learning (RL) framework designed to avoid obstacles precisely in densely cluttered environments. SemLoco uses a two-stage RL approach that combines both soft and hard constraints. It performs pixel-wise foothold safety inference, which enables more accurate foot placement. Additionally, SemLoco integrates a semantic map, allowing it to assign traversability costs instead of relying only on geometric data. SemLoco greatly reduces collisions and improves safety around sensitive objects, enabling reliable navigation in situations where traditional controllers would likely cause damage. Experimental results further show that SemLoco can be effectively applied to more complex, unstructured real-world environments. A demo video can be viewed at https://youtu.be/FSq-RSmIxOM.
RO | Feb 6, 2024 | Code
MMAUD: A Comprehensive Multi-Modal Anti-UAV Dataset for Modern Miniature Drone Threats
Shenghai Yuan, Yizhuo Yang, Thien Hoang Nguyen et al.
In response to the evolving challenges posed by small unmanned aerial vehicles (UAVs), which possess the potential to transport harmful payloads or independently cause damage, we introduce MMAUD: a comprehensive Multi-Modal Anti-UAV Dataset. MMAUD addresses a critical gap in contemporary threat detection methodologies by focusing on drone detection, UAV-type classification, and trajectory estimation. MMAUD stands out by combining diverse sensory inputs, including stereo vision, various lidars, radars, and audio arrays. It offers a unique overhead aerial detection perspective, vital for addressing real-world scenarios with higher fidelity than datasets captured from specific vantage points using thermal and RGB. Additionally, MMAUD provides accurate Leica-generated ground truth data, enhancing credibility and enabling confident refinement of algorithms and models, which has never been seen in other datasets. Most existing works do not disclose their datasets, making MMAUD an invaluable resource for developing accurate and efficient solutions. Our proposed modalities are cost-effective and highly adaptable, allowing users to experiment and implement new UAV threat detection tools. Our dataset closely simulates real-world scenarios by incorporating ambient heavy machinery sounds. This approach enhances the dataset's applicability, capturing the exact challenges faced during proximate vehicular operations. It is expected that MMAUD can play a pivotal role in advancing UAV threat detection, classification, trajectory estimation capabilities, and beyond. Our dataset, codes, and designs will be available at https://github.com/ntu-aris/MMAUD.
RO | Mar 16
Topological Motion Planning Diffusion: Generative Tangle-Free Path Planning for Tethered Robots in Obstacle-Rich Environments
Yifu Tian, Xinhang Xu, Thien-Minh Nguyen et al.
In extreme environments such as underwater exploration and post-disaster rescue, tethered robots require continuous navigation while avoiding cable entanglement. Traditional planners struggle in these lifelong planning scenarios due to topological unawareness, while topology-augmented graph-search methods face computational bottlenecks in obstacle-rich environments where the number of candidate topological classes increases. To address these challenges, we propose Topological Motion Planning Diffusion (TMPD), a novel generative planning framework that integrates lifelong topological memory. Instead of relying on sequential graph search, TMPD leverages a diffusion model to propose a multimodal front-end of kinematically feasible trajectory candidates across various homotopy classes. A tether-aware topological back-end then filters and optimizes these candidates by computing generalized winding numbers to evaluate their topological energy against the accumulated tether configuration. Benchmarking in obstacle-rich simulated environments demonstrates that TMPD achieves a collision-free reach of 100% and a tangle-free rate of 97.0%, outperforming traditional topological search and purely kinematic diffusion baselines in both geometric smoothness and computational efficiency. Simulation with realistic cable dynamics further validates the practicality of the proposed approach.
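To make the winding-number idea mentioned above more concrete, here is a minimal sketch of the classic 2D winding number of a closed path around a point obstacle. It is only an illustration of the underlying quantity: the generalized winding numbers and topological energy used by TMPD are not specified in the abstract, and the function name `winding_number` and the point-obstacle model are assumptions for this example.

```python
import math


def winding_number(path, point):
    """Signed number of turns a closed 2D polyline makes around `point`.

    `path` is a list of (x, y) vertices; the last vertex is implicitly
    connected back to the first. Counter-clockwise turns count as positive.
    """
    px, py = point
    total = 0.0
    n = len(path)
    for i in range(n):
        x0, y0 = path[i][0] - px, path[i][1] - py
        x1, y1 = path[(i + 1) % n][0] - px, path[(i + 1) % n][1] - py
        # Signed angle between consecutive rays from the point,
        # computed from the 2D cross and dot products.
        total += math.atan2(x0 * y1 - y0 * x1, x0 * x1 + y0 * y1)
    return total / (2.0 * math.pi)


if __name__ == "__main__":
    square = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
    print(winding_number(square, (0.0, 0.0)))  # loop encloses the obstacle
    print(winding_number(square, (5.0, 0.0)))  # obstacle lies outside the loop
```

A nonzero value indicates that the loop wraps around the obstacle, which is the kind of signal a tether-aware planner could use to detect entanglement.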
RO | Mar 8 | Code
PanoDP: Learning Collision-Free Navigation with Panoramic Depth and Differentiable Physics
Hao Zhong, Pei Chi, Jiang Zhao et al.
Autonomous collision-free navigation in cluttered environments requires safe decision-making under partial observability with both static structure and dynamic obstacles. We present PanoDP, a communication-free learning framework that combines four-view panoramic depth perception with differentiable-physics-based training signals. PanoDP encodes panoramic depth using a lightweight CNN and optimizes policies with dense differentiable collision and motion-feasibility terms, improving training stability beyond sparse terminal collisions. We evaluate PanoDP on a controlled ring-to-center benchmark with systematic sweeps over agent count, obstacle density/layout, and dynamic behaviors, and further test out-of-distribution generalization in an external simulator (e.g., AirSim). Across settings, PanoDP increases collision-free and completion rates over single-view and non-physics-guided baselines under matched training budgets, and ablations (view masking, rotation augmentation) confirm the policy leverages 360-degree information. Code will be open source upon acceptance.
RO | Mar 18, 2024
MCD: Diverse Large-Scale Multi-Campus Dataset for Robot Perception
Thien-Minh Nguyen, Shenghai Yuan, Thien Hoang Nguyen et al.
Perception plays a crucial role in various robot applications. However, existing well-annotated datasets are biased towards autonomous driving scenarios, while unlabelled SLAM datasets are quickly over-fitted, and often lack environment and domain variations. To expand the frontier of these fields, we introduce a comprehensive dataset named MCD (Multi-Campus Dataset), featuring a wide range of sensing modalities, high-accuracy ground truth, and diverse challenging environments across three Eurasian university campuses. MCD comprises both CCS (Classical Cylindrical Spinning) and NRE (Non-Repetitive Epicyclic) lidars, high-quality IMUs (Inertial Measurement Units), cameras, and UWB (Ultra-WideBand) sensors. Furthermore, in a pioneering effort, we introduce semantic annotations of 29 classes over 59k sparse NRE lidar scans across three domains, thus providing a novel challenge to existing semantic segmentation research upon this largely unexplored lidar modality. Finally, we propose, for the first time to the best of our knowledge, continuous-time ground truth based on optimization-based registration of lidar-inertial data on large survey-grade prior maps, which are also publicly released, each several times the size of existing ones. We conduct a rigorous evaluation of numerous state-of-the-art algorithms on MCD, report their performance, and highlight the challenges awaiting solutions from the research community.
CV | Mar 10, 2024
PSS-BA: LiDAR Bundle Adjustment with Progressive Spatial Smoothing
Jianping Li, Thien-Minh Nguyen, Shenghai Yuan et al.
Accurate and consistent construction of point clouds from LiDAR scanning data is fundamental for 3D modeling applications. Current solutions, such as multiview point cloud registration and LiDAR bundle adjustment, predominantly depend on the local plane assumption, which may be inadequate in complex environments that lack planar geometries or when initial pose errors are substantial. To mitigate this problem, this paper presents a LiDAR bundle adjustment with progressive spatial smoothing, which is suitable for complex environments and exhibits improved convergence capabilities. The proposed method consists of a spatial smoothing module and a pose adjustment module, which together combine the benefits of local consistency and global accuracy. With the spatial smoothing module, we can obtain robust and rich surface constraints by employing smoothing kernels across various scales. Then the pose adjustment module corrects all poses utilizing the novel surface constraints. Ultimately, the proposed method simultaneously achieves fine poses and parametric surfaces that can be directly employed for high-quality point cloud reconstruction. The effectiveness and robustness of our proposed approach have been validated on both simulation and real-world datasets. The experimental results demonstrate that the proposed method outperforms the existing methods and achieves better accuracy in complex environments with few planar structures.
RO | Mar 6
Task-Level Decisions to Gait Level Control: A Hierarchical Policy Approach for Quadruped Navigation
Sijia Li, Haoyu Wang, Shenghai Yuan et al.
Real-world quadruped navigation is constrained by a scale mismatch between high-level navigation decisions and low-level gait execution, as well as by instabilities under out-of-distribution environmental changes. Such variations challenge sim-to-real transfer and can trigger falls when policies lack explicit interfaces for adaptation. In this paper, we present a hierarchical policy architecture for quadrupedal navigation, termed Task-level Decision to Gait Control (TDGC). A low-level policy, trained with reinforcement learning in simulation, delivers gait-conditioned locomotion and maps task requirements to a compact set of controllable behavior parameters, enabling robust mode generation and smooth switching. A high-level policy makes task-centric decisions from sparse semantic or geometric terrain cues and translates them into low-level targets, forming a traceable decision pipeline without dense maps or high-resolution terrain reconstruction. Different from end-to-end approaches, our architecture provides explicit interfaces for deployment-time tuning, fault diagnosis, and policy refinement. We introduce a structured curriculum with performance-driven progression that expands environmental difficulty and disturbance ranges. Experiments show higher task success rates on mixed terrains and out-of-distribution tests.
RO | Mar 8
Multi-Agent Off-World Exploration for Sparse Evidence Discovery via Gaussian Belief Mapping and Dual-Domain Coverage
Zhuoran Qiao, Tianxin Hu, Thien-Minh Nguyen et al.
Off-world multi-robot exploration is challenged by sparse targets, limited sensing, hazardous terrain, and restricted communication. Many scientifically valuable clues are visually ambiguous and often require close-range observations, making efficient and safe informative path planning essential. Existing methods often rely on predefined areas of interest (AOIs), which may be incomplete or biased, and typically handle terrain risk only through soft penalties, which are insufficient for avoiding non-recoverable regions. To address these issues, we propose a multi-agent informative path planning framework for sparse evidence discovery based on Gaussian belief mapping and dual-domain coverage. The method maintains Gaussian-process-based interest and risk beliefs and combines them with trajectory-intent representations to support coordinated sequential decision-making among multiple agents. It further prioritizes search inside the AOI while preserving limited exploration outside it, thereby improving robustness to AOI bias. In addition, the risk-aware design helps agents balance information gain and operational safety in hazardous environments. Experimental results in simulated lunar environments show that the proposed method consistently outperforms sampling-based and greedy baselines under different budgets and communication ranges. In particular, it achieves lower final uncertainty in risk-aware settings and remains robust under limited communication, demonstrating its effectiveness for cooperative off-world robotic exploration.
RO | Feb 1, 2022
NTU VIRAL: A Visual-Inertial-Ranging-Lidar Dataset, From an Aerial Vehicle Viewpoint
Thien-Minh Nguyen, Shenghai Yuan, Muqing Cao et al.
In recent years, autonomous robots have become ubiquitous in research and daily life. Among many factors, public datasets play an important role in the progress of this field, as they waive the tall order of initial investment in hardware and manpower. However, for research on autonomous aerial systems, there appears to be a relative lack of public datasets on par with those used for autonomous driving and ground robots. Thus, to fill in this gap, we conduct a data collection exercise on an aerial platform equipped with an extensive and unique set of sensors: two 3D lidars, two hardware-synchronized global-shutter cameras, multiple Inertial Measurement Units (IMUs), and especially, multiple Ultra-wideband (UWB) ranging units. The comprehensive sensor suite resembles that of an autonomous driving car, but features distinct and challenging characteristics of aerial operations. We record multiple datasets in several challenging indoor and outdoor conditions. Calibration results and ground truth from a high-accuracy laser tracker are also included in each package. All resources can be accessed via our webpage https://ntu-aris.github.io/ntu_viral_dataset.
RO | May 7, 2021
VIRAL SLAM: Tightly Coupled Camera-IMU-UWB-Lidar SLAM
Thien-Minh Nguyen, Shenghai Yuan, Muqing Cao et al.
In this paper, we propose a tightly-coupled, multi-modal simultaneous localization and mapping (SLAM) framework, integrating an extensive set of sensors: IMU, cameras, multiple lidars, and Ultra-wideband (UWB) range measurements, hence referred to as VIRAL (visual-inertial-ranging-lidar) SLAM. To achieve such a comprehensive sensor fusion system, one has to tackle several challenges such as data synchronization, multi-threaded programming, bundle adjustment (BA), and conflicting coordinate frames between UWB and the onboard sensors, so as to ensure real-time localization and smooth updates in the state estimates. To this end, we propose a two-stage approach. In the first stage, lidar, camera, and IMU data on a local sliding window are processed in a core odometry thread. From this local graph, new key frames are evaluated for admission to a global map. Visual feature-based loop closure is also performed to supplement the global factor graph with loop constraints. When the global factor graph satisfies a condition on spatial diversity, the BA process will be triggered to update the coordinate transform between UWB and onboard SLAM systems. The system then seamlessly transitions to the second stage where all sensors are tightly integrated in the odometry thread. The capability of our system is demonstrated via several experiments on high-fidelity graphical-physical simulation and public datasets.
RO | Apr 24, 2021
MILIOM: Tightly Coupled Multi-Input Lidar-Inertia Odometry and Mapping
Thien-Minh Nguyen, Shenghai Yuan, Muqing Cao et al.
In this letter we investigate a tightly coupled Lidar-Inertia Odometry and Mapping (LIOM) scheme, with the capability to incorporate multiple lidars with complementary fields of view (FOV). In essence, we devise a time-synchronized scheme to combine extracted features from separate lidars into a single pointcloud, which is then used to construct a local map and compute the feature-map matching (FMM) coefficients. These coefficients, along with the IMU preintegration observations, are then used to construct a factor graph that will be optimized to produce an estimate of the sliding window trajectory. We also propose a key frame-based map management strategy to marginalize certain poses and pointclouds in the sliding window to grow a global map, which is used to assemble the local map in the later stage. The use of multiple lidars with complementary FOV and the global map ensures that our estimate has low drift and can sustain good localization in situations where a single lidar gives poor results, or even fails to work. Multi-thread computation implementations are also adopted to fractionally cut down the computation time and ensure real-time performance. We demonstrate the efficacy of our system via a series of experiments on public datasets collected from an aerial vehicle.
RO | Dec 28, 2020
SPINS: Structure Priors aided Inertial Navigation System
Yang Lyu, Thien-Minh Nguyen, Liu Liu et al.
Although Simultaneous Localization and Mapping (SLAM) has been an active research topic for decades, current state-of-the-art methods still suffer from instability or inaccuracy in many civilian environments, due to feature insufficiency or inherent estimation drift. To resolve these issues, we propose a navigation system combining SLAM and prior-map-based localization. Specifically, we consider additional integration of line and plane features, which are ubiquitous and more structurally salient in civilian environments, into the SLAM to ensure feature sufficiency and localization robustness. More importantly, we incorporate general prior map information into the SLAM to restrain its drift and improve the accuracy. To avoid rigorous association between prior information and local observations, we parameterize the prior knowledge as low-dimensional structural priors defined as relative distances/angles between different geometric primitives. The localization is formulated as a graph-based optimization problem that contains sliding-window-based variables and factors, including IMU, heterogeneous features, and structure priors. We also derive the analytical expressions of Jacobians of different factors to avoid the automatic differentiation overhead. To further alleviate the computation burden of incorporating structural prior factors, a selection mechanism is adopted based on the so-called information gain to incorporate only the most effective structure priors in the graph optimization. Finally, the proposed framework is extensively tested on synthetic data, public datasets, and, more importantly, on the real UAV flight data obtained from a building inspection task. The results show that the proposed scheme can effectively improve the accuracy and robustness of localization for autonomous robots in civilian applications.
RO | Oct 25, 2020
LIRO: Tightly Coupled Lidar-Inertia-Ranging Odometry
Thien-Minh Nguyen, Muqing Cao, Shenghai Yuan et al.
In recent years, thanks to the continuously reduced cost and weight of 3D Lidar, the applications of this type of sensor in the robotics community have become increasingly popular. Despite much progress, estimation drift and tracking loss are still prevalent concerns associated with these systems. However, in theory these issues can be resolved with the use of some observations to fixed landmarks in the environment. This motivates us to investigate a tightly coupled sensor fusion scheme of Ultra-Wideband (UWB) range measurements with Lidar and inertia measurements. First, data from IMU, Lidar and UWB are associated with the robot's states on a sliding window based on their timestamps. Then, we construct a cost function comprising factors from UWB, Lidar and IMU preintegration measurements. Finally, an optimization process is carried out to estimate the robot's position and orientation. Via some real-world experiments, we show that the method can effectively resolve the drift issue, while only requiring two or three anchors deployed in the environment.
RO | Oct 23, 2020
VIRAL-Fusion: A Visual-Inertial-Ranging-Lidar Sensor Fusion Approach
Thien-Minh Nguyen, Shenghai Yuan, Muqing Cao et al.
In recent years, Onboard Self Localization (OSL) methods based on cameras or Lidar have made significant progress. However, some issues such as estimation drift and feature-dependence remain inherent limitations. On the other hand, infrastructure-based methods can generally overcome these issues, but at the expense of some installation cost. This poses an interesting problem of how to effectively combine these methods, so as to achieve localization with long-term consistency as well as flexibility compared to any single method. To this end, we propose a comprehensive optimization-based estimator for the 15-dimensional state of an Unmanned Aerial Vehicle (UAV), fusing data from an extensive set of sensors: inertial measurement units (IMUs), Ultra-Wideband (UWB) ranging sensors, and multiple onboard Visual-Inertial and Lidar odometry subsystems. In essence, a sliding window is used to formulate a sequence of robot poses, where relative rotational and translational constraints between these poses are observed in the IMU preintegration and OSL observations, while orientation and position are coupled in body-offset UWB range observations. An optimization-based approach is developed to estimate the trajectory of the robot in this sliding window. We evaluate the performance of the proposed scheme in multiple scenarios, including experiments on public datasets, a high-fidelity graphical-physical simulator, and field-collected data from UAV flight tests. The results demonstrate that our integrated localization method can effectively resolve the drift issue, while incurring minimal installation requirements.
RO | Feb 28, 2018
Graph Optimization Approach to Range-based Localization
Xu Fang, Chen Wang, Thien-Minh Nguyen et al.
In this paper, we propose a general graph optimization based framework for localization, which can accommodate different types of measurements with varying measurement time intervals. Special emphasis is placed on range-based localization. Range and trajectory smoothness constraints are constructed in a position graph, then the robot trajectory over a sliding window is estimated by a graph-based optimization algorithm. Moreover, convergence analysis of the algorithm is provided, and the effects of the number of iterations and window size in the optimization on the localization accuracy are analyzed. Extensive experiments on a quadcopter under a variety of scenarios verify the effectiveness of the proposed algorithm and demonstrate a much higher localization accuracy than the existing range-based localization methods, especially in the altitude direction.
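The range constraints described above can be illustrated with a much-reduced sketch: a single 2D position estimated from ranges to known anchors by Gauss-Newton iteration. This is not the paper's sliding-window graph (there are no smoothness constraints and only one node); the function name `locate`, the 2D setting, the anchor layout, and the fixed iteration count are all assumptions made for illustration.

```python
import math


def locate(anchors, ranges, guess, iterations=20):
    """Gauss-Newton estimate of a 2D position from anchor ranges (sketch).

    Minimizes the sum over anchors of (|p - a_i| - r_i)^2, starting from `guess`.
    """
    x, y = guess
    for _ in range(iterations):
        h11 = h12 = h22 = g1 = g2 = 0.0
        for (ax, ay), r in zip(anchors, ranges):
            dx, dy = x - ax, y - ay
            dist = math.hypot(dx, dy)
            if dist < 1e-9:
                continue  # Jacobian is undefined exactly at an anchor
            jx, jy = dx / dist, dy / dist  # gradient of the range w.r.t. p
            res = dist - r                 # range residual
            # Accumulate J^T J and J^T res.
            h11 += jx * jx
            h12 += jx * jy
            h22 += jy * jy
            g1 += jx * res
            g2 += jy * res
        det = h11 * h22 - h12 * h12
        if abs(det) < 1e-12:
            break  # degenerate geometry; keep the current estimate
        # Step: p <- p - (J^T J)^{-1} J^T res, with the 2x2 inverse written out.
        x -= (h22 * g1 - h12 * g2) / det
        y -= (h11 * g2 - h12 * g1) / det
    return x, y


if __name__ == "__main__":
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    truth = (3.0, 4.0)
    ranges = [math.hypot(truth[0] - ax, truth[1] - ay) for ax, ay in anchors]
    print(locate(anchors, ranges, guess=(5.0, 5.0)))
```

In a full graph formulation, each pose in the window would contribute residuals like these, plus terms linking consecutive poses, and all poses would be updated jointly.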
SY | Feb 25, 2018
Robust Target-relative Localization with Ultra-Wideband Ranging and Communication
Thien-Minh Nguyen, Abdul Hanif Zaini, Chen Wang et al.
In this paper we propose a method to achieve relative positioning and tracking of a target by a quadcopter using Ultra-wideband (UWB) ranging sensors, which are strategically installed to help retrieve both relative position and bearing between the quadcopter and target. To achieve robust localization for autonomous flight even with uncertainty in the speed of the target, two main features are developed. First, an estimator based on the Extended Kalman Filter (EKF) is developed to fuse UWB ranging measurements with data from onboard sensors including an inertial measurement unit (IMU), altimeters and optical flow. Second, to properly handle the coupling of the target's orientation with the range measurements, UWB-based communication capability is utilized to transfer the target's orientation to the quadcopter. Experimental results demonstrate the ability of the quadcopter to control its position relative to the target autonomously, both when the target is static and when it is moving.
RO | Feb 20, 2018
Correlation Flow: Robust Optical Flow Using Kernel Cross-Correlators
Chen Wang, Tete Ji, Thien-Minh Nguyen et al.
Robust velocity and position estimation is crucial for autonomous robot navigation. Optical flow based methods for autonomous navigation have been receiving increasing attention in tandem with the development of micro unmanned aerial vehicles. This paper proposes a kernel cross-correlator (KCC) based algorithm to determine optical flow using a monocular camera, named correlation flow (CF). Correlation flow is able to provide reliable and accurate velocity estimation and is robust to motion blur. In addition, it can also estimate the altitude velocity and yaw rate, which are not available from traditional methods. Autonomous flight tests on a quadcopter show that correlation flow can provide robust trajectory estimation with very low processing power. The source code is released based on the ROS framework.
RO | Sep 30, 2017
Ultra-Wideband Aided Fast Localization and Mapping System
Chen Wang, Handuo Zhang, Thien-Minh Nguyen et al.
This paper proposes an ultra-wideband (UWB) aided localization and mapping system that leverages an inertial sensor and a depth camera. Motivated by the fact that visual odometry (VO) systems, regardless of their accuracy in the short term, still face challenges with accumulated errors in the long run or under unfavourable environments, the UWB ranging measurements are fused to remove the visual drift and improve the robustness. A general framework is developed which consists of three parallel threads, two of which carry out the visual-inertial odometry (VIO) and UWB localization respectively. The third, mapping thread integrates visual tracking constraints into a pose graph with the proposed smooth and virtual range constraints, such that an optimization is performed to provide robust trajectory estimation. Experiments show that the proposed system is able to create dense drift-free maps in real time in featureless environments, even when running on an ultra-low power processor.