RO | Oct 15, 2025 | Code
Opti-Acoustic Scene Reconstruction in Highly Turbid Underwater Environments
Ivana Collado-Gonzalez, John McConnell, Paul Szenher et al.
Scene reconstruction is an essential capability for underwater robots navigating in close proximity to structures. Monocular vision-based reconstruction methods are unreliable in turbid waters and lack depth scale information. Sonars are robust to turbid water and non-uniform lighting conditions; however, they have low resolution and elevation ambiguity. This work proposes a real-time opti-acoustic scene reconstruction method that is specially optimized to work in turbid water. Our strategy avoids having to identify point features in visual data and instead identifies regions of interest in the data. We then match relevant regions in the image to corresponding sonar data. A reconstruction is obtained by leveraging range data from the sonar and elevation data from the camera image. Experimental comparisons against other vision-based and sonar-based approaches at varying turbidity levels, and field tests conducted in marina environments, validate the effectiveness of the proposed approach. We have made our code open-source to facilitate reproducibility and encourage community engagement.
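As a rough illustration of the last step in the abstract above (sonar range combined with camera-derived elevation), the sketch below converts one matched measurement into a 3D point. The function name, the axis convention, and the assumption that sonar and camera share one origin are all hypothetical simplifications, not details taken from the paper.

```python
import math


def fuse_range_and_elevation(range_m, bearing_rad, elevation_rad):
    """Return (x, y, z) for a sonar return whose elevation comes from the camera.

    Assumed frame (not from the paper): x forward, y left, z up, with the
    sonar and camera treated as co-located and time-synchronised.
    """
    horizontal = range_m * math.cos(elevation_rad)  # projection onto the x-y plane
    x = horizontal * math.cos(bearing_rad)
    y = horizontal * math.sin(bearing_rad)
    z = range_m * math.sin(elevation_rad)
    return (x, y, z)
```

A real system would also need the extrinsic calibration between the two sensors; the spherical-to-Cartesian conversion is only the final geometric step.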
46.4 | RO | Mar 15 | Code
Towards Versatile Opti-Acoustic Sensor Fusion and Volumetric Mapping
Ivana Collado-Gonzalez, John McConnell, Brendan Englot
Accurate 3D volumetric mapping is critical for autonomous underwater vehicles operating in obstacle-rich environments. Vision-based perception provides high-resolution data but fails in turbid conditions, while sonar is robust to lighting and turbidity but suffers from low resolution and elevation ambiguity. This paper presents a volumetric mapping framework that fuses a stereo sonar pair with a monocular camera to enable safe navigation under varying visibility conditions. Overlapping sonar fields of view resolve elevation ambiguity, producing fully defined 3D point clouds at each time step. The framework identifies regions of interest in camera images, associates them with corresponding sonar returns, and combines sonar range with camera-derived elevation cues to generate additional 3D points. Each 3D point is assigned a confidence value reflecting its reliability. These confidence-weighted points are fused using a Gaussian Process Volumetric Mapping framework that prioritizes the most reliable measurements. Experimental comparisons with other opti-acoustic and sonar-based approaches, along with field tests in a marina environment, demonstrate the method's effectiveness in capturing complex geometries and preserving critical information for robot navigation in both clear and turbid conditions. Our code is open-source to support community adoption.
10.2 | RO | May 19
Multi-Session Ground Texture SLAM in Low-Dynamic Environments
Kyle M. Hart, Brendan Englot
The simultaneous localization and mapping community has introduced a growing number of systems adapted for multi-session operations where the operational environment features low-dynamic changes that impact mapping, such as surface wear, weather phenomena, or seasonal change. These systems allow for lifelong operations by a robot within these environments. There is also growing interest in operations in environments where the unique ground texture is the only mapping feature available for use. However, these ground texture systems do not yet target multi-session, low-dynamic-change environments. This work explores the impact of three different techniques on trajectory estimation accuracy in these multi-session low-dynamic ground texture environments. Of the three, the use of Kullback-Leibler Divergence, as a similarity score and a bias influencing loop closure confidence, is found to have the most success. We show an analysis of all three methods and a deeper exploration of the impact of Kullback-Leibler Divergence. We also introduce a dataset for use by the robotics community that contains multi-session images where the ground changes between sessions and also high-accuracy pose information for use in evaluation.
13.3 | RO | Mar 24
Variable-Resolution Virtual Maps for Autonomous Exploration with Unmanned Surface Vehicles (USVs)
Ye Li, Yewei Huang, Wenlong GaoZhang et al.
Autonomous exploration by unmanned surface vehicles (USVs) in near-shore waters requires reliable localisation and consistent mapping over extended areas, but this is challenged by GNSS degradation, environment-induced localisation uncertainty, and limited on-board computation. Virtual map-based methods explicitly model localisation and mapping uncertainty by tightly coupling factor-graph SLAM with a map uncertainty criterion. However, their storage and computational costs scale poorly with fixed-resolution workspace discretisations, leading to inefficiency in large near-shore environments. Moreover, overvaluing feature-sparse open-water regions can increase the risk of SLAM failure as a result of imbalance between exploration and exploitation. To address these limitations, we propose a Variable-Resolution Virtual Map (VRVM), a computationally efficient method for representing map uncertainty using bivariate Gaussian virtual landmarks placed in the cells of an adaptive quadtree. The adaptive quadtree enables an area-weighted uncertainty representation that keeps coarse, far-field virtual landmarks deliberately uncertain while allocating higher resolution to information-dense regions, and reduces the sensitivity of the map valuation to local refinements of the tree. An expectation-maximisation (EM) planner is adopted to evaluate pose and map uncertainty along frontiers using the VRVM, balancing exploration and exploitation. We evaluate VRVM against several state-of-the-art exploration algorithms in the VRX Gazebo simulator, using a realistic marina environment across different testing scenarios with an increasing level of exploration difficulty. The results indicate that our method offers safer behaviour and better utilisation of on-board computation in GNSS-degraded near-shore environments.
RO | Apr 22, 2021 | Code
LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping
Tixiao Shan, Brendan Englot, Carlo Ratti et al.
We propose a framework for tightly-coupled lidar-visual-inertial odometry via smoothing and mapping, LVI-SAM, that achieves real-time state estimation and map-building with high accuracy and robustness. LVI-SAM is built atop a factor graph and is composed of two sub-systems: a visual-inertial system (VIS) and a lidar-inertial system (LIS). The two sub-systems are designed in a tightly-coupled manner, in which the VIS leverages LIS estimation to facilitate initialization. The accuracy of the VIS is improved by extracting depth information for visual features using lidar measurements. In turn, the LIS utilizes VIS estimation for initial guesses to support scan-matching. Loop closures are first identified by the VIS and further refined by the LIS. LVI-SAM can also function when one of the two sub-systems fails, which increases its robustness in both texture-less and feature-less environments. LVI-SAM is extensively evaluated on datasets gathered from several platforms over a variety of scales and environments. Our implementation is available at https://git.io/lvi-sam
CV | Aug 3, 2025
CVD-SfM: A Cross-View Deep Front-end Structure-from-Motion System for Sparse Localization in Multi-Altitude Scenes
Yaxuan Li, Yewei Huang, Bijay Gaudel et al.
We present a novel multi-altitude camera pose estimation system, addressing the challenges of robust and accurate localization across varied altitudes when only considering sparse image input. The system effectively handles diverse environmental conditions and viewpoint variations by integrating the cross-view transformer, deep features, and structure-from-motion into a unified framework. To benchmark our method and foster further research, we introduce two newly collected datasets specifically tailored for multi-altitude camera pose estimation; datasets of this nature remain rare in the current literature. The proposed framework has been validated through extensive comparative analyses on these datasets, demonstrating that our system achieves superior performance in both accuracy and robustness for multi-altitude sparse pose estimation tasks compared to existing solutions, making it well suited for real-world robotic applications such as aerial navigation, search and rescue, and automated inspection.
RO | Feb 16, 2022
Virtual Maps for Autonomous Exploration of Cluttered Underwater Environments
Jinkun Wang, Fanfei Chen, Yewei Huang et al.
We consider the problem of autonomous mobile robot exploration in an unknown environment, taking into account a robot's coverage rate, map uncertainty, and state estimation uncertainty. This paper presents a novel exploration framework for underwater robots operating in cluttered environments, built upon simultaneous localization and mapping (SLAM) with imaging sonar. The proposed system comprises path generation, place recognition forecasting, belief propagation and utility evaluation using a virtual map, which estimates the uncertainty associated with map cells throughout a robot's workspace. We evaluate the performance of this framework in simulated experiments, showing that our algorithm maintains a high coverage rate during exploration while also maintaining low mapping and localization error. The real-world applicability of our framework is also demonstrated on an underwater remotely operated vehicle (ROV) exploring a harbor environment.
RO | Feb 11, 2022
Overhead Image Factors for Underwater Sonar-based SLAM
John McConnell, Fanfei Chen, Brendan Englot
Simultaneous localization and mapping (SLAM) is a critical capability for any autonomous underwater vehicle (AUV). However, robust, accurate state estimation is still a work in progress when using low-cost sensors. We propose enhancing a typical low-cost sensor package using widely available and often free prior information: overhead imagery. Given an AUV's sonar image and a partially overlapping, globally-referenced overhead image, we propose using a convolutional neural network (CNN) to generate a synthetic overhead image predicting the above-surface appearance of the sonar image contents. We then use this synthetic overhead image to register our observations to the provided global overhead image. Once registered, the transformation is introduced as a factor into a pose SLAM factor graph. We use a state-of-the-art simulation environment to perform validation over a series of benchmark trajectories and quantitatively show the improved accuracy of robot state estimation using the proposed approach. We also show qualitative outcomes from a real AUV field deployment. Video attachment: https://youtu.be/_uWljtp58ks
RO | May 11, 2021
Zero-Shot Reinforcement Learning on Graphs for Autonomous Exploration Under Uncertainty
Fanfei Chen, Paul Szenher, Yewei Huang et al.
This paper studies the problem of autonomous exploration under localization uncertainty for a mobile robot with 3D range sensing. We present a framework for self-learning a high-performance exploration policy in a single simulation environment, and transferring it to other environments, which may be physical or virtual. Recent work in transfer learning achieves encouraging performance by domain adaptation and domain randomization to expose an agent to scenarios that fill the inherent gaps in sim2sim and sim2real approaches. However, it is inefficient to train an agent in environments with randomized conditions to learn the important features of its current state. An agent can use domain knowledge provided by human experts to learn efficiently. We propose a novel approach that uses graph neural networks in conjunction with deep reinforcement learning, enabling decision-making over graphs containing relevant exploration information provided by human experts to predict a robot's optimal sensing action in belief space. The policy, which is trained only in a single simulation environment, offers a real-time, scalable, and transferable decision-making strategy, resulting in zero-shot transfer to other simulation environments and even real-world environments.
RO | Apr 7, 2021
Predictive 3D Sonar Mapping of Underwater Environments via Object-specific Bayesian Inference
John McConnell, Brendan Englot
Recent work has achieved dense 3D reconstruction with wide-aperture imaging sonar using a stereo pair of orthogonally oriented sonars. This allows each sonar to observe a spatial dimension that the other is missing, without requiring any prior assumptions about scene geometry. However, this is achieved only in a small region with overlapping fields-of-view, leaving large regions of sonar image observations with an unknown elevation angle. Our work aims to achieve large-scale 3D reconstruction more efficiently using this sensor arrangement. We propose dividing the world into semantic classes to exploit the presence of repeating structures in the subsea environment. We use a Bayesian inference framework to build an understanding of each object class's geometry when 3D information is available from the orthogonal sonar fusion system, and when the elevation angle of our returns is unknown, our framework is used to infer unknown 3D structure. We quantitatively validate our method in a simulation and use data collected from a real outdoor littoral environment to demonstrate the efficacy of our framework in the field. Video attachment: https://www.youtube.com/watch?v=WouCrY9eK4o&t=75s
LG | Apr 5, 2021
Fast Design Space Exploration of Nonlinear Systems: Part I
Sanjai Narain, Emily Mak, Dana Chee et al.
System design tools are often only available as input-output blackboxes: for a given design as input they compute an output representing system behavior. Blackboxes are intended to be run in the forward direction. This paper presents a new method of solving the inverse design problem: given requirements or constraints on output, find an input that also optimizes an objective function. This problem is challenging for several reasons. First, blackboxes are not designed to be run in reverse. Second, inputs and outputs can be discrete and continuous. Third, finding designs concurrently satisfying a set of requirements is hard because designs satisfying individual requirements may conflict with each other. Fourth, blackbox evaluations can be expensive. Finally, blackboxes can sometimes fail to produce an output. This paper presents CNMA, a new method of solving the inverse problem that overcomes these challenges. CNMA tries to sample only the part of the design space relevant to solving the problem, leveraging the power of neural networks, Mixed Integer Linear Programs, and a new learning-from-failure feedback loop. The paper also presents a parallel version of CNMA that improves the efficiency and quality of solutions over the sequential version, and tries to steer it away from local optima. CNMA's performance is evaluated against conventional optimization methods for seven nonlinear design problems of 8 (two problems), 10, 15, 36 and 60 real-valued dimensions and one with 186 binary dimensions. Conventional methods evaluated are off-the-shelf implementations of Bayesian Optimization with Gaussian Processes, Nelder-Mead and Random Search. The first two do not solve problems that are high-dimensional, have discrete and continuous variables or whose blackboxes can fail to return values. CNMA solves all problems, and surpasses the performance of conventional methods by up to 87%.
CV | Mar 3, 2021
Robust Place Recognition using an Imaging Lidar
Tixiao Shan, Brendan Englot, Fabio Duarte et al.
We propose a methodology for robust, real-time place recognition using an imaging lidar, which yields image-quality high-resolution 3D point clouds. Utilizing the intensity readings of an imaging lidar, we project the point cloud and obtain an intensity image. ORB feature descriptors are extracted from the image and encoded into a bag-of-words vector. The vector, used to identify the point cloud, is inserted into a database that is maintained by DBoW for fast place recognition queries. The returned candidate is further validated by matching visual feature descriptors. To reject matching outliers, we apply PnP, which minimizes the reprojection error of visual features' positions in Euclidean space with their correspondences in 2D image space, using RANSAC. Combining the advantages from both camera and lidar-based place recognition approaches, our method is truly rotation-invariant, and can tackle reverse revisiting and upside down revisiting. The proposed method is evaluated on datasets gathered from a variety of platforms over different scales and environments. Our implementation and datasets are available at https://git.io/image-lidar
AI | Oct 19, 2020
Robot Design With Neural Networks, MILP Solvers and Active Learning
Sanjai Narain, Emily Mak, Dana Chee et al.
Central to the design of many robot systems and their controllers is solving a constrained blackbox optimization problem. This paper presents CNMA, a new method of solving this problem that is conservative in the number of potentially expensive blackbox function evaluations; allows specifying complex, even recursive constraints directly rather than as hard-to-design penalty or barrier functions; and is resilient to the non-termination of function evaluations. CNMA leverages the ability of neural networks to approximate any continuous function, their transformation into equivalent mixed integer linear programs (MILPs) and their optimization subject to constraints with industrial strength MILP solvers. A new learning-from-failure step guides the learning to be relevant to solving the constrained optimization problem. Thus, the amount of learning is orders of magnitude smaller than that needed to learn functions over their entire domains. CNMA is illustrated with the design of several robotic systems: wave-energy propelled boat, lunar lander, hexapod, cartpole, acrobot and parallel parking. These range from 6 real-valued dimensions to 36. We show that CNMA surpasses the Nelder-Mead, Gaussian and Random Search optimization methods against the metric of number of function evaluations.
RO | Aug 2, 2020
Variational Filtering with Copula Models for SLAM
John D. Martin, Kevin Doherty, Caralyn Cyr et al.
The ability to infer map variables and estimate pose is crucial to the operation of autonomous mobile robots. In most cases the shared dependency between these variables is modeled through a multivariate Gaussian distribution, but there are many situations where that assumption is unrealistic. Our paper shows how it is possible to relax this assumption and perform simultaneous localization and mapping (SLAM) with a larger class of distributions, whose multivariate dependency is represented with a copula model. We integrate the distribution model with copulas into a Sequential Monte Carlo estimator and show how unknown model parameters can be learned through gradient-based optimization. We demonstrate our approach is effective in settings where Gaussian assumptions are clearly violated, such as environments with uncertain data association and nonlinear transition models.
RO | Jul 24, 2020
Autonomous Exploration Under Uncertainty via Deep Reinforcement Learning on Graphs
Fanfei Chen, John D. Martin, Yewei Huang et al.
We consider an autonomous exploration problem in which a range-sensing mobile robot is tasked with accurately mapping the landmarks in an a priori unknown environment efficiently in real-time; it must choose sensing actions that both curb localization uncertainty and achieve information gain. For this problem, belief space planning methods that forward-simulate robot sensing and estimation may often fail in real-time implementation, scaling poorly with increasing size of the state, belief and action spaces. We propose a novel approach that uses graph neural networks (GNNs) in conjunction with deep reinforcement learning (DRL), enabling decision-making over graphs containing exploration information to predict a robot's optimal sensing action in belief space. The policy, which is trained in different random environments without human intervention, offers a real-time, scalable decision-making process whose high-performance exploratory sensing actions yield accurate maps and high rates of information gain.
RO | Jul 20, 2020
Fusing Concurrent Orthogonal Wide-aperture Sonar Images for Dense Underwater 3D Reconstruction
John McConnell, John D. Martin, Brendan Englot
We propose a novel approach to handling the ambiguity in elevation angle associated with the observations of a forward looking multi-beam imaging sonar, and the challenges it poses for performing an accurate 3D reconstruction. We utilize a pair of sonars with orthogonal axes of uncertainty to independently observe the same points in the environment from two different perspectives, and associate these observations. Using these concurrent observations, we can create a dense, fully defined point cloud at every time-step to aid in reconstructing the 3D geometry of underwater scenes. We will evaluate our method in the context of the current state of the art, for which strong assumptions on object geometry limit applicability to generalized 3D scenes. We will discuss results from laboratory tests that quantitatively benchmark our algorithm's reconstruction capabilities, and results from a real-world, tidal river basin which qualitatively demonstrate our ability to reconstruct a cluttered field of underwater objects.
RO | Jul 16, 2020
A Receding Horizon Multi-Objective Planner for Autonomous Surface Vehicles in Urban Waterways
Tixiao Shan, Wei Wang, Brendan Englot et al.
We propose a novel receding horizon planner for an autonomous surface vehicle (ASV) performing path planning in urban waterways. Feasible paths are found by repeatedly generating and searching a graph reflecting the obstacles observed in the sensor field-of-view. We also propose a novel method for multi-objective motion planning over the graph by leveraging the paradigm of lexicographic optimization and applying it to graph search within our receding horizon planner. The competing resources of interest are penalized hierarchically during the search. Higher-ranked resources cause a robot to incur non-negative costs over the paths traveled, which are occasionally zero-valued. The framework is intended to capture problems in which a robot must manage resources such as risk of collision. This leaves freedom for tie-breaking with respect to lower-priority resources; at the bottom of the hierarchy is a strictly positive quantity consumed by the robot, such as distance traveled, energy expended or time elapsed. We conduct experiments in both simulated and real-world environments to validate the proposed planner and demonstrate its capability for enabling ASV navigation in complex environments.
RO | Jul 1, 2020
LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping
Tixiao Shan, Brendan Englot, Drew Meyers et al.
We propose a framework for tightly-coupled lidar inertial odometry via smoothing and mapping, LIO-SAM, that achieves highly accurate, real-time mobile robot trajectory estimation and map-building. LIO-SAM formulates lidar-inertial odometry atop a factor graph, allowing a multitude of relative and absolute measurements, including loop closures, to be incorporated from different sources as factors into the system. The estimated motion from inertial measurement unit (IMU) pre-integration de-skews point clouds and produces an initial guess for lidar odometry optimization. The obtained lidar odometry solution is used to estimate the bias of the IMU. To ensure high performance in real-time, we marginalize old lidar scans for pose optimization, rather than matching lidar scans to a global map. Scan-matching at a local scale instead of a global scale significantly improves the real-time performance of the system, as does the selective introduction of keyframes, and an efficient sliding window approach that registers a new keyframe to a fixed-size set of prior "sub-keyframes." The proposed method is extensively evaluated on datasets gathered from three platforms over various scales and environments.
RO | Apr 10, 2020
Simulation-based Lidar Super-resolution for Ground Vehicles
Tixiao Shan, Jinkun Wang, Fanfei Chen et al.
We propose a methodology for lidar super-resolution with ground vehicles driving on roadways, which relies completely on a driving simulator to enhance, via deep learning, the apparent resolution of a physical lidar. To increase the resolution of the point cloud captured by a sparse 3D lidar, we convert this problem from 3D Euclidean space into an image super-resolution problem in 2D image space, which is solved using a deep convolutional neural network. By projecting a point cloud onto a range image, we are able to efficiently enhance the resolution of such an image using a deep neural network. Typically, the training of a deep neural network requires vast real-world data. Our approach does not require any real-world data, as we train the network purely using computer-generated data. Thus, in theory, our method is applicable to enhancing any type of 3D lidar. In a novel application of Monte-Carlo dropout within the network, we remove the predictions with high uncertainty, so that our method produces high-accuracy point clouds comparable with the observations of a real high-resolution lidar. We present experimental results applying our method to several simulated and real-world datasets. We argue for the method's potential benefits in real-world robotics applications such as occupancy mapping and terrain modeling.
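The projection of a point cloud onto a range image mentioned in the abstract above can be sketched as follows. This is a minimal illustration only: the image size, the vertical field of view, and the function name are assumed values chosen for the example and are not taken from the paper.

```python
import math


def to_range_image(points, rows=16, cols=1024, fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points onto a rows x cols range image (0.0 = no return).

    The sensor geometry (16 rows, +/-15 degree vertical field of view) is an
    assumption made for this sketch.
    """
    image = [[0.0] * cols for _ in range(rows)]
    fov_up = math.radians(fov_up_deg)
    fov_span = fov_up - math.radians(fov_down_deg)
    for x, y, z in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue  # a point at the sensor origin has no direction
        azimuth = math.atan2(y, x)       # horizontal angle in [-pi, pi]
        elevation = math.asin(z / rng)   # vertical angle
        col = int((azimuth + math.pi) / (2.0 * math.pi) * cols) % cols
        row = math.floor((fov_up - elevation) / fov_span * rows)
        if 0 <= row < rows:
            # When two points share a pixel, keep the closer return.
            if image[row][col] == 0.0 or rng < image[row][col]:
                image[row][col] = rng
    return image
```

Once the cloud is in this form, super-resolution becomes an image upsampling problem along the row axis, which is the reformulation the abstract describes.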
RO | Sep 5, 2019
A Lexicographic Search Method for Multi-Objective Motion Planning
Tixiao Shan, Brendan Englot
We propose a novel method for multi-objective motion planning problems by leveraging the paradigm of lexicographic optimization and applying it for the first time to graph search over probabilistic roadmaps. The competing resources of interest are penalized hierarchically during the search. Higher-ranked resources cause a robot to incur non-negative costs over the paths traveled, which are occasionally zero-valued. This is intended to capture problems in which a robot must manage resources such as visibility of threats, availability of communications, and access to valuable measurements. This leaves freedom for tie-breaking with respect to lower-priority resources; at the bottom of the hierarchy is a strictly positive quantity consumed by the robot, such as distance traveled, energy expended or time elapsed. We compare our method with two other multi-objective approaches, a naive weighted sum method and an expanded graph search method, demonstrating that a lexicographic search can solve such planning problems efficiently without a need for parameter-tuning in unintuitive units. The proposed method is also demonstrated on hardware using a laser-equipped ground robot.
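To make the idea of hierarchical (lexicographic) cost comparison concrete, here is a minimal sketch of a Dijkstra-style graph search in which each edge carries a tuple of costs and tuples are compared in priority order. It is an illustration of lexicographic ordering in general, not the authors' algorithm; the graph encoding and the function name are hypothetical, and only two cost levels are shown.

```python
import heapq


def lexicographic_search(graph, start, goal):
    """Dijkstra over tuple-valued costs compared lexicographically.

    graph: {node: [(neighbor, (primary_cost, secondary_cost)), ...]}
    All costs are assumed non-negative. Returns (cost_tuple, path), or
    (None, []) if the goal is unreachable.
    """
    best = {start: (0.0, 0.0)}
    parent = {start: None}
    frontier = [((0.0, 0.0), start)]
    while frontier:
        cost, node = heapq.heappop(frontier)
        if cost > best.get(node, cost):
            continue  # stale queue entry; a cheaper route was already found
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return cost, path[::-1]
        for neighbor, edge in graph.get(node, []):
            new_cost = (cost[0] + edge[0], cost[1] + edge[1])
            # Python compares tuples element by element, so the secondary
            # cost only matters when the primary costs tie.
            if neighbor not in best or new_cost < best[neighbor]:
                best[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(frontier, (new_cost, neighbor))
    return None, []
```

In this sketch the primary slot could hold a frequently zero-valued resource such as risk, and the secondary slot a strictly positive one such as distance, so that distance acts purely as a tie-breaker.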
LG | May 17, 2019
Stochastically Dominant Distributional Reinforcement Learning
John D. Martin, Michal Lyskawinski, Xiaohu Li et al.
We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning (RL) paradigm. Instead of selecting actions according to a single statistic, we propose a distributional method based on the second-order stochastic dominance (SSD) relation. This compares the inherent dispersion of random returns induced by actions, producing a more comprehensive and robust evaluation of the environment's uncertainty. The necessary conditions for SSD require estimators to predict accurate second moments. To accommodate this, we map the distributional RL problem to a Wasserstein gradient flow, treating the distributional Bellman residual as a potential energy functional. We propose a particle-based algorithm for which we prove optimality and convergence. Our experiments characterize the algorithm performance and demonstrate how uncertainty and performance are better balanced using an SSD policy than with other risk measures.
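For readers unfamiliar with the SSD relation itself, the sketch below checks it on two equally sized, equally weighted sets of sampled returns, using the standard characterisation that A dominates B when every partial sum of A's sorted samples is at least the matching partial sum for B. This only illustrates the relation; it is not the particle-based algorithm from the paper, and the function name is hypothetical.

```python
from itertools import accumulate


def ssd_dominates(returns_a, returns_b):
    """Return True if sample set A weakly second-order dominates sample set B.

    Assumes both sets hold the same number of equally weighted samples.
    """
    if len(returns_a) != len(returns_b):
        raise ValueError("this sketch requires equally sized sample sets")
    cumulative_a = accumulate(sorted(returns_a))
    cumulative_b = accumulate(sorted(returns_b))
    # Compare the running totals of the k worst outcomes for every k.
    return all(a >= b for a, b in zip(cumulative_a, cumulative_b))
```

For example, returns of `[2, 2]` dominate `[1, 3]`: both have the same mean, but the first is less dispersed, which is the kind of preference the abstract describes.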
LG | Nov 17, 2018
Recursive Sparse Pseudo-input Gaussian Process SARSA
John Martin, Brendan Englot
The class of Gaussian Process (GP) methods for Temporal Difference learning has shown promise for data-efficient model-free Reinforcement Learning. In this paper, we consider a recent variant of the GP-SARSA algorithm, called Sparse Pseudo-input Gaussian Process SARSA (SPGP-SARSA), and derive recursive formulas for its predictive moments. This extension promotes greater memory efficiency, since previous computations can be reused and, interestingly, it provides a technique for updating value estimates on multiple timescales.
LG | Oct 2, 2018
Sparse Gaussian Process Temporal Difference Learning for Marine Robot Navigation
John Martin, Jinkun Wang, Brendan Englot
We present a method for Temporal Difference (TD) learning that addresses several challenges faced by robots learning to navigate in a marine environment. For improved data efficiency, our method reduces TD updates to Gaussian Process regression. To make predictions amenable to online settings, we introduce a sparse approximation with improved quality over current rejection-based sparse methods. We derive the predictive value function posterior and use the moments to obtain a new algorithm for model-free policy evaluation, SPGP-SARSA. With simple changes, we show SPGP-SARSA can be reduced to a model-based equivalent, SPGP-TD. We perform comprehensive simulation studies and also conduct physical learning trials with an underwater robot. Our results show SPGP-SARSA can outperform the state-of-the-art sparse method, replicate the prediction quality of its exact counterpart, and be applied to solve underwater navigation tasks.
RO | Nov 4, 2015
A bi-criteria path planning algorithm for robotics applications
Zachary Clawson, Xuchu Ding, Brendan Englot et al.
Realistic path planning applications often require optimizing with respect to several criteria simultaneously. Here we introduce an efficient algorithm for bi-criteria path planning on graphs. Our approach is based on augmenting the state space to keep track of the "budget" remaining to satisfy the constraints on secondary cost. The resulting augmented graph is acyclic and the primary cost can be then minimized by a simple upward sweep through budget levels. The efficiency and accuracy of our algorithm is tested on Probabilistic Roadmap graphs to minimize the distance of travel subject to a constraint on the overall threat exposure of the robot. We also present the results from field experiments illustrating the use of this approach on realistic robotic systems.
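The budget-augmented state space described in the abstract above can be sketched as a small dynamic program. The version below assumes the secondary cost has been quantised to positive integers, which is what makes the augmented graph acyclic and lets each budget level depend only on lower ones. The function name, the edge encoding, and the quantisation are assumptions of this sketch, not details from the paper.

```python
import math


def budgeted_shortest_cost(edges, nodes, goal, max_budget):
    """Return table[b][u]: least primary cost from u to goal with secondary cost <= b.

    edges: iterable of (u, v, primary_cost, secondary_cost), where the
    secondary cost is a positive integer (an assumption of this sketch).
    """
    outgoing = {u: [] for u in nodes}
    for u, v, primary, secondary in edges:
        if secondary <= 0:
            raise ValueError("secondary costs must be positive integers")
        outgoing[u].append((v, primary, secondary))

    table = []
    for budget in range(max_budget + 1):  # upward sweep through budget levels
        level = {}
        for u in nodes:
            if u == goal:
                level[u] = 0.0
                continue
            best = math.inf
            for v, primary, secondary in outgoing[u]:
                if secondary <= budget:
                    # Taking this edge spends part of the budget, so the
                    # remainder refers to an already computed lower level.
                    best = min(best, primary + table[budget - secondary][v])
            level[u] = best
        table.append(level)
    return table
```

With distance as the primary cost and threat exposure as the secondary cost, `table[b][start]` gives the shortest distance achievable without exceeding an exposure budget of `b`, and `math.inf` marks budgets that are too small to reach the goal.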