Jonathan P. How

RO
h-index64
124papers
6,500citations
Novelty56%
AI Score60

124 Papers

ROJun 29, 2023
Principles and Guidelines for Evaluating Social Robot Navigation Algorithms

Anthony Francis, Claudia Pérez-D'Arpino, Chengshu Li et al. · cmu, mit

A major challenge to deploying robots widely is navigation in human-populated environments, commonly referred to as social robot navigation. While the field of social navigation has advanced tremendously in recent years, the fair evaluation of algorithms that tackle social navigation remains hard because it involves not just robotic agents moving in static environments but also dynamic human agents and their perceptions of the appropriateness of robot behavior. In contrast, clear, repeatable, and accessible benchmarks have accelerated progress in fields like computer vision, natural language processing and traditional robot navigation by enabling researchers to fairly compare algorithms, revealing limitations of existing solutions and illuminating promising new directions. We believe the same approach can benefit social navigation. In this paper, we pave the road towards common, widely accessible, and repeatable benchmarking criteria to evaluate social robot navigation. Our contributions include (a) a definition of a socially navigating robot as one that respects the principles of safety, comfort, legibility, politeness, social competency, agent understanding, proactivity, and responsiveness to context, (b) guidelines for the use of metrics, development of scenarios, benchmarks, datasets, and simulators to evaluate social navigation, and (c) a design of a social navigation metrics framework to make it easier to compare results from different simulators, robots and datasets.

ROMar 25, 2022
Risk-Aware Off-Road Navigation via a Learned Speed Distribution Map

Xiaoyi Cai, Michael Everett, Jonathan Fink et al. · mit

Motion planning in off-road environments requires reasoning about both the geometry and semantics of the scene (e.g., a robot may be able to drive through soft bushes but not a fallen log). In many recent works, the world is classified into a finite number of semantic categories that often are not sufficient to capture the ability (i.e., the speed) with which a robot can traverse off-road terrain. Instead, this work proposes a new representation of traversability based exclusively on robot speed that can be learned from data, offers interpretability and intuitive tuning, and can be easily integrated with a variety of planning paradigms in the form of a costmap. Specifically, given a dataset of experienced trajectories, the proposed algorithm learns to predict a distribution of speeds the robot could achieve, conditioned on the environment semantics and commanded speed. The learned speed distribution map is converted into costmaps with a risk-aware cost term based on conditional value at risk (CVaR). Numerical simulations demonstrate that the proposed risk-aware planning algorithm leads to faster average time-to-goals compared to a method that only considers expected behavior, and the planner can be tuned for slightly slower, but less variable behavior. Furthermore, the approach is integrated into a full autonomy stack and demonstrated in a high-fidelity Unity environment and is shown to provide a 30\% improvement in the success rate of navigation.

SYSep 28, 2022
Backward Reachability Analysis of Neural Feedback Loops: Techniques for Linear and Nonlinear Systems

Nicholas Rober, Sydney M. Katz, Chelsea Sidrane et al. · mit

As neural networks (NNs) become more prevalent in safety-critical applications such as control of vehicles, there is a growing need to certify that systems with NN components are safe. This paper presents a set of backward reachability approaches for safety certification of neural feedback loops (NFLs), i.e., closed-loop systems with NN control policies. While backward reachability strategies have been developed for systems without NN components, the nonlinearities in NN activation functions and general noninvertibility of NN weight matrices make backward reachability for NFLs a challenging problem. To avoid the difficulties associated with propagating sets backward through NNs, we introduce a framework that leverages standard forward NN analysis tools to efficiently find over-approximations to backprojection (BP) sets, i.e., sets of states for which an NN policy will lead a system to a given target set. We present frameworks for calculating BP over approximations for both linear and nonlinear systems with control policies represented by feedforward NNs and propose computationally efficient strategies. We use numerical results from a variety of models to showcase the proposed algorithms, including a demonstration of safety certification for a 6D system.

SYApr 14, 2022
Backward Reachability Analysis for Neural Feedback Loops

Nicholas Rober, Michael Everett, Jonathan P. How · mit

The increasing prevalence of neural networks (NNs) in safety-critical applications calls for methods to certify their behavior and guarantee safety. This paper presents a backward reachability approach for safety verification of neural feedback loops (NFLs), i.e., closed-loop systems with NN control policies. While recent works have focused on forward reachability as a strategy for safety certification of NFLs, backward reachability offers advantages over the forward strategy, particularly in obstacle avoidance scenarios. Prior works have developed techniques for backward reachability analysis for systems without NNs, but the presence of NNs in the feedback loop presents a unique set of problems due to the nonlinearities in their activation functions and because NN models are generally not invertible. To overcome these challenges, we use existing forward NN analysis tools to find affine bounds on the control inputs and solve a series of linear programs (LPs) to efficiently find an approximation of the backprojection (BP) set, i.e., the set of states for which the NN control policy will drive the system to a given target set. We present an algorithm to iteratively find BP set estimates over a given time horizon and demonstrate the ability to reduce conservativeness in the BP set estimates by up to 88% with low additional computational cost. We use numerical results from a double integrator model to verify the efficacy of these algorithms and demonstrate the ability to certify safety for a linearized ground robot model in a collision avoidance scenario where forward reachability fails.

SYOct 14, 2022
A Hybrid Partitioning Strategy for Backward Reachability of Neural Feedback Loops

Nicholas Rober, Michael Everett, Songan Zhang et al. · mit

As neural networks become more integrated into the systems that we depend on for transportation, medicine, and security, it becomes increasingly important that we develop methods to analyze their behavior to ensure that they are safe to use within these contexts. The methods used in this paper seek to certify safety for closed-loop systems with neural network controllers, i.e., neural feedback loops, using backward reachability analysis. Namely, we calculate backprojection (BP) set over-approximations (BPOAs), i.e., sets of states that lead to a given target set that bounds dangerous regions of the state space. The system's safety can then be certified by checking its current state against the BPOAs. While over-approximating BPs is significantly faster than calculating exact BP sets, solving the relaxed problem leads to conservativeness. To combat conservativeness, partitioning strategies can be used to split the problem into a set of sub-problems, each less conservative than the unpartitioned problem. We introduce a hybrid partitioning method that uses both target set partitioning (TSP) and backreachable set partitioning (BRSP) to overcome a lower bound on estimation error that is present when using BRSP. Numerical results demonstrate a near order-of-magnitude reduction in estimation error compared to BRSP or TSP given the same computation time.

LGMar 7, 2022
Influencing Long-Term Behavior in Multiagent Reinforcement Learning

Dong-Ki Kim, Matthew Riemer, Miao Liu et al. · mit

The main challenge of multiagent reinforcement learning is the difficulty of learning useful policies in the presence of other simultaneously learning agents whose changing behaviors jointly affect the environment's transition and reward dynamics. An effective approach that has recently emerged for addressing this non-stationarity is for each agent to anticipate the learning of other agents and influence the evolution of future policies towards desirable behavior for its own benefit. Unfortunately, previous approaches for achieving this suffer from myopic evaluation, considering only a finite number of policy updates. As such, these methods can only influence transient future policies rather than achieving the promise of scalable equilibrium selection approaches that influence the behavior at convergence. In this paper, we propose a principled framework for considering the limiting policies of other agents as time approaches infinity. Specifically, we develop a new optimization objective that maximizes each agent's average reward by directly accounting for the impact of its behavior on the limiting set of policies that other agents will converge to. Our paper characterizes desirable solution concepts within this problem setting and provides practical approaches for optimizing over possible outcomes. As a result of our farsighted objective, we demonstrate better long-term performance than state-of-the-art baselines across a suite of diverse multiagent benchmark domains.

LGJul 17, 2023
RAYEN: Imposition of Hard Convex Constraints on Neural Networks

Jesus Tordesillas, Jonathan P. How, Marco Hutter · eth-zurich

This paper presents RAYEN, a framework to impose hard convex constraints on the output or latent variable of a neural network. RAYEN guarantees that, for any input or any weights of the network, the constraints are satisfied at all times. Compared to other approaches, RAYEN does not perform a computationally-expensive orthogonal projection step onto the feasible set, does not rely on soft constraints (which do not guarantee the satisfaction of the constraints at test time), does not use conservative approximations of the feasible set, and does not perform a potentially slow inner gradient descent correction to enforce the constraints. RAYEN supports any combination of linear, convex quadratic, second-order cone (SOC), and linear matrix inequality (LMI) constraints, achieving a very small computational overhead compared to unconstrained networks. For example, it is able to impose 1K quadratic constraints on a 1K-dimensional variable with an overhead of less than 8 ms, and an LMI constraint with 300x300 dense matrices on a 10K-dimensional variable in less than 12 ms. When used in neural networks that approximate the solution of constrained optimization problems, RAYEN achieves computation times between 20 and 7468 times faster than state-of-the-art algorithms, while guaranteeing the satisfaction of the constraints at all times and obtaining a cost very close to the optimal one.

CVOct 15, 2022Code
MIXER: Multiattribute, Multiway Fusion of Uncertain Pairwise Affinities

Parker C. Lusk, Kaveh Fathian, Jonathan P. How

We present a multiway fusion algorithm capable of directly processing uncertain pairwise affinities. In contrast to existing works that require initial pairwise associations, our MIXER algorithm improves accuracy by leveraging the additional information provided by pairwise affinities. Our main contribution is a multiway fusion formulation that is particularly suited to processing non-binary affinities and a novel continuous relaxation whose solutions are guaranteed to be binary, thus avoiding the typical, but potentially problematic, solution binarization steps that may cause infeasibility. A crucial insight of our formulation is that it allows for three modes of association, ranging from non-match, undecided, and match. Exploiting this insight allows fusion to be delayed for some data pairs until more information is available, which is an effective feature for fusion of data with multiple attributes/information sources. We evaluate MIXER on typical synthetic data and benchmark datasets and show increased accuracy against the state of the art in multiway matching, especially in noisy regimes with low observation redundancy. Additionally, we collect RGB data of cars in a parking lot to demonstrate MIXER's ability to fuse data having multiple attributes (color, visual appearance, and bounding box). On this challenging dataset, MIXER achieves 74% F1 accuracy and is 49x faster than the next best algorithm, which has 42% accuracy. Open source code is available at https://github.com/mit-acl/mixer.

RONov 10, 2023
EVORA: Deep Evidential Traversability Learning for Risk-Aware Off-Road Autonomy

Xiaoyi Cai, Siddharth Ancha, Lakshay Sharma et al. · mit

Traversing terrain with good traction is crucial for achieving fast off-road navigation. Instead of manually designing costs based on terrain features, existing methods learn terrain properties directly from data via self-supervision to automatically penalize trajectories moving through undesirable terrain, but challenges remain to properly quantify and mitigate the risk due to uncertainty in learned models. To this end, this work proposes a unified framework to learn uncertainty-aware traction model and plan risk-aware trajectories. For uncertainty quantification, we efficiently model both aleatoric and epistemic uncertainty by learning discrete traction distributions and probability densities of the traction predictor's latent features. Leveraging evidential deep learning, we parameterize Dirichlet distributions with the network outputs and propose a novel uncertainty-aware squared Earth Mover's distance loss with a closed-form expression that improves learning accuracy and navigation performance. For risk-aware navigation, the proposed planner simulates state trajectories with the worst-case expected traction to handle aleatoric uncertainty, and penalizes trajectories moving through terrain with high epistemic uncertainty. Our approach is extensively validated in simulation and on wheeled and quadruped robots, showing improved navigation performance compared to methods that assume no slip, assume the expected traction, or optimize for the worst-case expected cost.

70.5ROJun 4
Meridian: Metric-Semantic Primitive Matching for Cross-View Geo-Localization Beyond Urban Environments

Mason Peterson, Qingyuan Li, Yixuan Jia et al.

Successful robot automation requires accurate global localization to support repeatability, task planning, goal specification, and safe operation. However, reliable localization in GNSS-denied environments remains an open problem. Overhead aerial imagery offers a promising solution, but existing approaches primarily target structured urban environments and have been rarely demonstrated in unstructured natural terrain. Limitations of the state-of-the-art include a reliance on models trained for specific environments, as well as difficulty handling repetitive geometries and featureless landscapes commonly found in natural outdoor areas. To overcome these challenges, we present Meridian, a method for matching high-level metric-semantic primitives across aerial images and ground robot RGB-D camera data that achieves accurate global localization and generalizes well across diverse environments, all without any training or algorithmic fine-tuning on area-specific data. We formulate novel consistency metrics to estimate a distribution over robot submap poses and to reject outlier hypotheses in a robust pose graph optimization step for accurate robot trajectory estimation. We demonstrate that our algorithm can localize a ground robot across a wide variety of environments, including an autonomous driving dataset, a park and campus area, and a wilderness camp, with an average optimized trajectory error of 2.4 m over 19 km of ground traversal.

AIOct 31, 2025Code
Advancing AI Challenges for the United States Department of the Air Force

Christian Prothmann, Vijay Gadepally, Jeremy Kepner et al.

The DAF-MIT AI Accelerator is a collaboration between the United States Department of the Air Force (DAF) and the Massachusetts Institute of Technology (MIT). This program pioneers fundamental advances in artificial intelligence (AI) to expand the competitive advantage of the United States in the defense and civilian sectors. In recent years, AI Accelerator projects have developed and launched public challenge problems aimed at advancing AI research in priority areas. Hallmarks of AI Accelerator challenges include large, publicly available, and AI-ready datasets to stimulate open-source solutions and engage the wider academic and private sector AI ecosystem. This article supplements our previous publication, which introduced AI Accelerator challenges. We provide an update on how ongoing and new challenges have successfully contributed to AI research and applications of AI technologies.

APMar 1, 2017
Quickest Change Detection Approach to Optimal Control in Markov Decision Processes with Model Changes

Taposh Banerjee, Miao Liu, Jonathan P. How

Optimal control in non-stationary Markov decision processes (MDP) is a challenging problem. The aim in such a control problem is to maximize the long-term discounted reward when the transition dynamics or the reward function can change over time. When a prior knowledge of change statistics is available, the standard Bayesian approach to this problem is to reformulate it as a partially observable MDP (POMDP) and solve it using approximate POMDP solvers, which are typically computationally demanding. In this paper, the problem is analyzed through the viewpoint of quickest change detection (QCD), a set of tools for detecting a change in the distribution of a sequence of random variables. Current methods applying QCD to such problems only passively detect changes by following prescribed policies, without optimizing the choice of actions for long term performance. We demonstrate that ignoring the reward-detection trade-off can cause a significant loss in long term rewards, and propose a two threshold switching strategy to solve the issue. A non-Bayesian problem formulation is also proposed for scenarios where a Bayesian formulation cannot be defined. The performance of the proposed two threshold strategy is examined through numerical analysis on a non-stationary MDP task, and the strategy outperforms the state-of-the-art QCD methods in both Bayesian and non-Bayesian settings.

CVMar 7, 2023
DeepSeeColor: Realtime Adaptive Color Correction for Autonomous Underwater Vehicles via Deep Learning Methods

Stewart Jamieson, Jonathan P. How, Yogesh Girdhar

Successful applications of complex vision-based behaviours underwater have lagged behind progress in terrestrial and aerial domains. This is largely due to the degraded image quality resulting from the physical phenomena involved in underwater image formation. Spectrally-selective light attenuation drains some colors from underwater images while backscattering adds others, making it challenging to perform vision-based tasks underwater. State-of-the-art methods for underwater color correction optimize the parameters of image formation models to restore the full spectrum of color to underwater imagery. However, these methods have high computational complexity that is unfavourable for realtime use by autonomous underwater vehicles (AUVs), as a result of having been primarily designed for offline color correction. Here, we present DeepSeeColor, a novel algorithm that combines a state-of-the-art underwater image formation model with the computational efficiency of deep learning frameworks. In our experiments, we show that DeepSeeColor offers comparable performance to the popular "Sea-Thru" algorithm (Akkaynak & Treibitz, 2019) while being able to rapidly process images at up to 60Hz, thus making it suitable for use onboard AUVs as a preprocessing step to enable more robust vision-based behaviours.

CVMar 10, 2022
City-wide Street-to-Satellite Image Geolocalization of a Mobile Ground Agent

Lena M. Downes, Dong-Ki Kim, Ted J. Steiner et al.

Cross-view image geolocalization provides an estimate of an agent's global position by matching a local ground image to an overhead satellite image without the need for GPS. It is challenging to reliably match a ground image to the correct satellite image since the images have significant viewpoint differences. Existing works have demonstrated localization in constrained scenarios over small areas but have not demonstrated wider-scale localization. Our approach, called Wide-Area Geolocalization (WAG), combines a neural network with a particle filter to achieve global position estimates for agents moving in GPS-denied environments, scaling efficiently to city-scale regions. WAG introduces a trinomial loss function for a Siamese network to robustly match non-centered image pairs and thus enables the generation of a smaller satellite image database by coarsely discretizing the search area. A modified particle filter weighting scheme is also presented to improve localization accuracy and convergence. Taken together, WAG's network training and particle filter weighting approach achieves city-scale position estimation accuracies on the order of 20 meters, a 98% reduction compared to a baseline training and weighting approach. Applied to a smaller-scale testing area, WAG reduces the final position estimation error by 64% compared to a state-of-the-art baseline from the literature. WAG's search space discretization additionally significantly reduces storage and processing requirements.

RODec 24, 2022
GraffMatch: Global Matching of 3D Lines and Planes for Wide Baseline LiDAR Registration

Parker C. Lusk, Devarth Parikh, Jonathan P. How

Using geometric landmarks like lines and planes can increase navigation accuracy and decrease map storage requirements compared to commonly-used LiDAR point cloud maps. However, landmark-based registration for applications like loop closure detection is challenging because a reliable initial guess is not available. Global landmark matching has been investigated in the literature, but these methods typically use ad hoc representations of 3D line and plane landmarks that are not invariant to large viewpoint changes, resulting in incorrect matches and high registration error. To address this issue, we adopt the affine Grassmannian manifold to represent 3D lines and planes and prove that the distance between two landmarks is invariant to rotation and translation if a shift operation is performed before applying the Grassmannian metric. This invariance property enables the use of our graph-based data association framework for identifying landmark matches that can subsequently be used for registration in the least-squares sense. Evaluated on a challenging landmark matching and registration task using publicly-available LiDAR datasets, our approach yields a 1.7x and 3.5x improvement in successful registrations compared to methods that use viewpoint-dependent centroid and "closest point" representations, respectively.

SYJan 16, 2018
Active Sampling-based Binary Verification of Dynamical Systems

John F. Quindlen, Ufuk Topcu, Girish Chowdhary et al.

Nonlinear, adaptive, or otherwise complex control techniques are increasingly relied upon to ensure the safety of systems operating in uncertain environments. However, the nonlinearity of the resulting closed-loop system complicates verification that the system does in fact satisfy those requirements at all possible operating conditions. While analytical proof-based techniques and finite abstractions can be used to provably verify the closed-loop system's response at different operating conditions, they often produce conservative approximations due to restrictive assumptions and are difficult to construct in many applications. In contrast, popular statistical verification techniques relax the restrictions and instead rely upon simulations to construct statistical or probabilistic guarantees. This work presents a data-driven statistical verification procedure that instead constructs statistical learning models from simulated training data to separate the set of possible perturbations into "safe" and "unsafe" subsets. Binary evaluations of closed-loop system requirement satisfaction at various realizations of the uncertainties are obtained through temporal logic robustness metrics, which are then used to construct predictive models of requirement satisfaction over the full set of possible uncertainties. As the accuracy of these predictive statistical models is inherently coupled to the quality of the training data, an active learning algorithm selects additional sample points in order to maximize the expected change in the data-driven model and thus, indirectly, minimize the prediction error. Various case studies demonstrate the closed-loop verification procedure and highlight improvements in prediction error over both existing analytical and statistical verification techniques.

ROOct 18, 2022
Output Feedback Tube MPC-Guided Data Augmentation for Robust, Efficient Sensorimotor Policy Learning

Andrea Tagliabue, Jonathan P. How

Imitation learning (IL) can generate computationally efficient sensorimotor policies from demonstrations provided by computationally expensive model-based sensing and control algorithms. However, commonly employed IL methods are often data-inefficient, requiring the collection of a large number of demonstrations and producing policies with limited robustness to uncertainties. In this work, we combine IL with an output feedback robust tube model predictive controller (RTMPC) to co-generate demonstrations and a data augmentation strategy to efficiently learn neural network-based sensorimotor policies. Thanks to the augmented data, we reduce the computation time and the number of demonstrations needed by IL, while providing robustness to sensing and process uncertainty. We tailor our approach to the task of learning a trajectory tracking visuomotor policy for an aerial robot, leveraging a 3D mesh of the environment as part of the data augmentation process. We numerically demonstrate that our method can learn a robust visuomotor policy from a single demonstration--a two-orders of magnitude improvement in demonstration efficiency compared to existing IL methods.

SYOct 1, 2017
Active Sampling for Constrained Simulation-based Verification of Uncertain Nonlinear Systems

John F. Quindlen, Ufuk Topcu, Girish Chowdhary et al.

Increasingly demanding performance requirements for dynamical systems motivates the adoption of nonlinear and adaptive control techniques. One challenge is the nonlinearity of the resulting closed-loop system complicates verification that the system does satisfy the requirements at all possible operating conditions. This paper presents a data-driven procedure for efficient simulation-based, statistical verification without the reliance upon exhaustive simulations. In contrast to previous work, this approach introduces a method for online estimation of prediction accuracy without the use of external validation sets. This work also develops a novel active sampling algorithm that iteratively selects additional training points in order to maximize the accuracy of the predictions while still limited to a sample budget. Three case studies demonstrate the utility of the new approach and the results show up to a 50% improvement over state-of-the-art techniques.

ROSep 23, 2022
Wide-Area Geolocalization with a Limited Field of View Camera

Lena M. Downes, Ted J. Steiner, Rebecca L. Russell et al.

Cross-view geolocalization, a supplement or replacement for GPS, localizes an agent within a search area by matching images taken from a ground-view camera to overhead images taken from satellites or aircraft. Although the viewpoint disparity between ground and overhead images makes cross-view geolocalization challenging, significant progress has been made assuming that the ground agent has access to a panoramic camera. For example, our prior work (WAG) introduced changes in search area discretization, training loss, and particle filter weighting that enabled city-scale panoramic cross-view geolocalization. However, panoramic cameras are not widely used in existing robotic platforms due to their complexity and cost. Non-panoramic cross-view geolocalization is more applicable for robotics, but is also more challenging. This paper presents Restricted FOV Wide-Area Geolocalization (ReWAG), a cross-view geolocalization approach that generalizes WAG for use with standard, non-panoramic ground cameras by creating pose-aware embeddings and providing a strategy to incorporate particle pose into the Siamese network. ReWAG is a neural network and particle filter system that is able to globally localize a mobile agent in a GPS-denied environment with only odometry and a 90 degree FOV camera, achieving similar localization accuracy as what WAG achieved with a panoramic camera and improving localization accuracy by a factor of 100 compared to a baseline vision transformer (ViT) approach. A video highlight that demonstrates ReWAG's convergence on a test path of several dozen kilometers is available at https://youtu.be/U_OBQrt8qCE.

SYOct 1, 2017
Closed-Loop Statistical Verification of Stochastic Nonlinear Systems Subject to Parametric Uncertainties

John F. Quindlen, Ufuk Topcu, Girish Chowdhary et al.

This paper proposes a statistical verification framework using Gaussian processes (GPs) for simulation-based verification of stochastic nonlinear systems with parametric uncertainties. Given a small number of stochastic simulations, the proposed framework constructs a GP regression model and predicts the system's performance over the entire set of possible uncertainties. Included in the framework is a new metric to estimate the confidence in those predictions based on the variance of the GP's cumulative distribution function. This variance-based metric forms the basis of active sampling algorithms that aim to minimize prediction error through careful selection of simulations. In three case studies, the new active sampling algorithms demonstrate up to a 35% improvement in prediction error over other approaches and are able to correctly identify regions with low prediction confidence through the variance metric.

GTOct 28, 2022
Game-Theoretical Perspectives on Active Equilibria: A Preferred Solution Concept over Nash Equilibria

Dong-Ki Kim, Matthew Riemer, Miao Liu et al.

Multiagent learning settings are inherently more difficult than single-agent learning because each agent interacts with other simultaneously learning agents in a shared environment. An effective approach in multiagent reinforcement learning is to consider the learning process of agents and influence their future policies toward desirable behaviors from each agent's perspective. Importantly, if each agent maximizes its long-term rewards by accounting for the impact of its behavior on the set of convergence policies, the resulting multiagent system reaches an active equilibrium. While this new solution concept is general such that standard solution concepts, such as a Nash equilibrium, are special cases of active equilibria, it is unclear when an active equilibrium is a preferred equilibrium over other solution concepts. In this paper, we analyze active equilibria from a game-theoretic perspective by closely studying examples where Nash equilibria are known. By directly comparing active equilibria to Nash equilibria in these examples, we find that active equilibria find more effective solutions than Nash equilibria, concluding that an active equilibrium is the desired solution for multiagent learning settings.

99.4SYMar 16
A Forward Reachability Perspective on Control Barrier Functions and Discount Factors in Reachability Analysis

Jason J. Choi, Donggun Lee, Boyang Li et al.

Control invariant sets are crucial for various methods that aim to design safe control policies for systems whose state constraints must be satisfied over an indefinite time horizon. In this article, we explore the connections among reachability, control invariance, and Control Barrier Functions (CBFs). Unlike prior formulations based on backward reachability concepts, we establish a strong link between these three concepts by examining the inevitable Forward Reachable Tube (FRT), which is the set of states such that every trajectory reaching the FRT must have passed through a given initial set of states. First, our findings show that the inevitable FRT is a robust control invariant set if it has a continuously differentiable boundary. If the boundary is not differentiable, the FRT may lose invariance. We also show that any robust control invariant set including the initial set is a superset of the FRT if the boundary of the invariant set is differentiable. Next, we formulate a differential game between the control and disturbance, where the inevitable FRT is characterized by the zero-superlevel set of the value function. By incorporating a discount factor in the cost function of the game, the barrier constraint of the CBF naturally arises in the Hamilton-Jacobi (HJ) equation and determines the optimal policy. The resulting FRT value function serves as a CBF-like function, and conversely, any valid CBF is also a forward reachability value function. We further prove that any $C^1$ supersolution of the HJ equation for the FRT value functions is a valid CBF and characterizes a robust control invariant set that outer-approximates the FRT. Building on this property, finally, we devise a novel method that learns neural control barrier functions, which learn an control invariant superset of the FRT of a given initial set.

ROSep 28, 2022
View-Invariant Localization using Semantic Objects in Changing Environments

Jacqueline Ankenbauer, Kaveh Fathian, Jonathan P. How

This paper proposes a novel framework for real-time localization and egomotion tracking of a vehicle in a reference map. The core idea is to map the semantic objects observed by the vehicle and register them to their corresponding objects in the reference map. While several recent works have leveraged semantic information for cross-view localization, the main contribution of this work is a view-invariant formulation that makes the approach directly applicable to any viewpoint configuration for which objects are detectable. Another distinctive feature is robustness to changes in the environment/objects due to a data association scheme suited for extreme outlier regimes (e.g., 90% association outliers). To demonstrate our framework, we consider an example of localizing a ground vehicle in a reference object map using only cars as objects. While only a stereo camera is used for the ground vehicle, we consider reference maps constructed a priori from ground viewpoints using stereo cameras and Lidar scans, and georeferenced aerial images captured at a different date to demonstrate the framework's robustness to different modalities, viewpoints, and environment changes. Evaluations on the KITTI dataset show that over a 3.7 km trajectory, localization occurs in 36 sec and is followed by real-time egomotion tracking with an average position error of 8.5 m in a Lidar reference map, and on an aerial object map where 77% of objects are outliers, localization is achieved in 71 sec with an average position error of 7.9 m.

LGMar 14, 2022
Safe adaptation in multiagent competition

Macheng Shen, Jonathan P. How

Achieving the capability of adapting to ever-changing environments is a critical step towards building fully autonomous robots that operate safely in complicated scenarios. In multiagent competitive scenarios, agents may have to adapt to new opponents with previously unseen behaviors by learning from the interaction experiences between the ego-agent and the opponent. However, this adaptation is susceptible to opponent exploitation. As the ego-agent updates its own behavior to exploit the opponent, its own behavior could become more exploitable as a result of overfitting to this specific opponent's behavior. To overcome this difficulty, we developed a safe adaptation approach in which the ego-agent is trained against a regularized opponent model, which effectively avoids overfitting and consequently improves the robustness of the ego-agent's policy. We evaluated our approach in the Mujoco domain with two competing agents. The experiment results suggest that our approach effectively achieves both adaptation to the specific opponent that the ego-agent is interacting with and maintaining low exploitability to other possible opponent exploitation.

ROSep 20, 2022
Robust, High-Rate Trajectory Tracking on Insect-Scale Soft-Actuated Aerial Robots with Deep-Learned Tube MPC

Andrea Tagliabue, Yi-Hsuan Hsiao, Urban Fasel et al.

Accurate and agile trajectory tracking in sub-gram Micro Aerial Vehicles (MAVs) is challenging, as the small scale of the robot induces large model uncertainties, demanding robust feedback controllers, while the fast dynamics and computational constraints prevent the deployment of computationally expensive strategies. In this work, we present an approach for agile and computationally efficient trajectory tracking on the MIT SoftFly, a sub-gram MAV (0.7 grams). Our strategy employs a cascaded control scheme, where an adaptive attitude controller is combined with a neural network policy trained to imitate a trajectory tracking robust tube model predictive controller (RTMPC). The neural network policy is obtained using our recent work, which enables the policy to preserve the robustness of RTMPC, but at a fraction of its computational cost. We experimentally evaluate our approach, achieving position Root Mean Square Errors lower than 1.8 cm even in the more challenging maneuvers, obtaining a 60% reduction in maximum position error compared to our previous work, and demonstrating robustness to large external disturbances

ROMar 28, 2023
Efficient Deep Learning of Robust, Adaptive Policies using Tube MPC-Guided Data Augmentation

Tong Zhao, Andrea Tagliabue, Jonathan P. How

The deployment of agile autonomous systems in challenging, unstructured environments requires adaptation capabilities and robustness to uncertainties. Existing robust and adaptive controllers, such as those based on model predictive control (MPC), can achieve impressive performance at the cost of heavy online onboard computations. Strategies that efficiently learn robust and onboard-deployable policies from MPC have emerged, but they still lack fundamental adaptation capabilities. In this work, we extend an existing efficient Imitation Learning (IL) algorithm for robust policy learning from MPC with the ability to learn policies that adapt to challenging model/environment uncertainties. The key idea of our approach consists in modifying the IL procedure by conditioning the policy on a learned lower-dimensional model/environment representation that can be efficiently estimated online. We tailor our approach to the task of learning an adaptive position and attitude control policy to track trajectories under challenging disturbances on a multirotor. Evaluations in simulation show that a high-quality adaptive policy can be obtained in about $1.3$ hours. We additionally empirically demonstrate rapid adaptation to in- and out-of-training-distribution uncertainties, achieving a $6.1$ cm average position error under wind disturbances that correspond to about $50\%$ of the weight of the robot, and that are $36\%$ larger than the maximum wind seen during training.

OCMar 30, 2023
Surrogate Neural Networks for Efficient Simulation-based Trajectory Planning Optimization

Evelyn Ruff, Rebecca Russell, Matthew Stoeckle et al.

This paper presents a novel methodology that uses surrogate models in the form of neural networks to reduce the computation time of simulation-based optimization of a reference trajectory. Simulation-based optimization is necessary when there is no analytical form of the system accessible, only input-output data that can be used to create a surrogate model of the simulation. Like many high-fidelity simulations, this trajectory planning simulation is very nonlinear and computationally expensive, making it challenging to optimize iteratively. Through gradient descent optimization, our approach finds the optimal reference trajectory for landing a hypersonic vehicle. In contrast to the large datasets used to create the surrogate models in prior literature, our methodology is specifically designed to minimize the number of simulation executions required by the gradient descent optimizer. We demonstrated this methodology to be more efficient than the standard practice of hand-tuning the inputs through trial-and-error or randomly sampling the input parameter space. Due to the intelligently selected input values to the simulation, our approach yields better simulation outcomes that are achieved more rapidly and to a higher degree of accuracy. Optimizing the hypersonic vehicle's reference trajectory is very challenging due to the simulation's extreme nonlinearity, but even so, this novel approach found a 74% better-performing reference trajectory compared to nominal, and the numerical results clearly show a substantial reduction in computation time for designing future trajectories.

ROSep 4, 2024
PIETRA: Physics-Informed Evidential Learning for Traversing Out-of-Distribution Terrain

Xiaoyi Cai, James Queeney, Tong Xu et al.

Self-supervised learning is a powerful approach for developing traversability models for off-road navigation, but these models often struggle with inputs unseen during training. Existing methods utilize techniques like evidential deep learning to quantify model uncertainty, helping to identify and avoid out-of-distribution terrain. However, always avoiding out-of-distribution terrain can be overly conservative, e.g., when novel terrain can be effectively analyzed using a physics-based model. To overcome this challenge, we introduce Physics-Informed Evidential Traversability (PIETRA), a self-supervised learning framework that integrates physics priors directly into the mathematical formulation of evidential neural networks and introduces physics knowledge implicitly through an uncertainty-aware, physics-informed training loss. Our evidential network seamlessly transitions between learned and physics-based predictions for out-of-distribution inputs. Additionally, the physics-informed loss regularizes the learned model, ensuring better alignment with the physics model. Extensive simulations and hardware experiments demonstrate that PIETRA improves both learning accuracy and navigation performance in environments with significant distribution shifts.

RONov 23, 2023
Tube-NeRF: Efficient Imitation Learning of Visuomotor Policies from MPC using Tube-Guided Data Augmentation and NeRFs

Andrea Tagliabue, Jonathan P. How

Imitation learning (IL) can train computationally-efficient sensorimotor policies from a resource-intensive Model Predictive Controller (MPC), but it often requires many samples, leading to long training times or limited robustness. To address these issues, we combine IL with a variant of robust MPC that accounts for process and sensing uncertainties, and we design a data augmentation (DA) strategy that enables efficient learning of vision-based policies. The proposed DA method, named Tube-NeRF, leverages Neural Radiance Fields (NeRFs) to generate novel synthetic images, and uses properties of the robust MPC (the tube) to select relevant views and to efficiently compute the corresponding actions. We tailor our approach to the task of localization and trajectory tracking on a multirotor, by learning a visuomotor policy that generates control actions using images from the onboard camera as only source of horizontal position. Numerical evaluations show 80-fold increase in demonstration efficiency and a 50% reduction in training time over current IL methods. Additionally, our policies successfully transfer to a real multirotor, achieving low tracking errors despite large disturbances, with an onboard inference time of only 1.5 ms. Video: https://youtu.be/_W5z33ZK1m4

55.8ROApr 8
SANDO: Safe Autonomous Trajectory Planning for Dynamic Unknown Environments

Kota Kondo, Jesús Tordesillas, Jonathan P. How

SANDO is a safe trajectory planner for 3D dynamic unknown environments, where obstacle locations and motions are unknown a priori and a collision-free plan can become unsafe at any moment, requiring fast replanning. Existing soft-constraint planners are fast but cannot guarantee collision-free paths, while hard-constraint methods ensure safety at the cost of longer computation. SANDO addresses this trade-off through three contributions. First, a heat map-based A* global planner steers paths away from high-risk regions using soft costs, and a spatiotemporal safe flight corridor (STSFC) generator produces time-layered polytopes that inflate obstacles only by their worst-case reachable set at each time layer, rather than by the worst case over the entire horizon. Second, trajectory optimization is formulated as a Mixed-Integer Quadratic Program (MIQP) with hard collision-avoidance constraints, and a variable elimination technique reduces the number of decision variables, enabling fast computation. Third, a formal safety analysis establishes collision-free guarantees under explicit velocity-bound and estimation-error assumptions. Ablation studies show that variable elimination yields up to 7.4x speedup in optimization time, and that STSFCs are critical for feasibility in dense dynamic environments. Benchmark simulations against state-of-the-art methods across standardized static benchmarks, obstacle-rich static forests, and dynamic environments show that SANDO consistently achieves the highest success rate with no constraint violations across all difficulty levels; perception-only experiments without ground truth obstacle information confirm robust performance under realistic sensing. Hardware experiments on a UAV with fully onboard planning, perception, and localization demonstrate six safe flights in static environments and ten safe flights among dynamic obstacles.

ROSep 30, 2024
An Overview of the Burer-Monteiro Method for Certifiable Robot Perception

Alan Papalia, Yulun Tian, David M. Rosen et al.

This paper presents an overview of the Burer-Monteiro method (BM), a technique that has been applied to solve robot perception problems to certifiable optimality in real-time. BM is often used to solve semidefinite programming relaxations, which can be used to perform global optimization for non-convex perception problems. Specifically, BM leverages the low-rank structure of typical semidefinite programs to dramatically reduce the computational cost of performing optimization. This paper discusses BM in certifiable perception, with three main objectives: (i) to consolidate information from the literature into a unified presentation, (ii) to elucidate the role of the linear independence constraint qualification (LICQ), a concept not yet well-covered in certifiable perception literature, and (iii) to share practical considerations that are discussed among practitioners but not thoroughly covered in the literature. Our general aim is to offer a practical primer for applying BM towards certifiable perception.

ROFeb 11, 2024Code
CLIPPER: Robust Data Association without an Initial Guess

Parker C. Lusk, Jonathan P. How

Identifying correspondences in noisy data is a critically important step in estimation processes. When an informative initial estimation guess is available, the data association challenge is less acute; however, the existence of a high-quality initial guess is rare in most contexts. We explore graph-theoretic formulations for data association, which do not require an initial estimation guess. Existing graph-theoretic approaches optimize over unweighted graphs, discarding important consistency information encoded in weighted edges, and frequently attempt to solve NP-hard problems exactly. In contrast, we formulate a new optimization problem that fully leverages weighted graphs and seeks the densest edge-weighted clique. We introduce two relaxations to this problem: a convex semidefinite relaxation which we find to be empirically tight, and a fast first-order algorithm called CLIPPER which frequently arrives at nearly-optimal solutions in milliseconds. When evaluated on point cloud registration problems, our algorithms remain robust up to at least 95% outliers while existing algorithms begin breaking down at 80% outliers. Code is available at https://mit-acl.github.io/clipper.

SYSep 30, 2024
Constraint-Aware Refinement for Safety Verification of Neural Feedback Loops

Nicholas Rober, Jonathan P. How

Neural networks (NNs) are becoming increasingly popular in the design of control pipelines for autonomous systems. However, since the performance of NNs can degrade in the presence of out-of-distribution data or adversarial attacks, systems that have NNs in their control pipelines, i.e., neural feedback loops (NFLs), need safety assurances before they can be applied in safety-critical situations. Reachability analysis offers a solution to this problem by calculating reachable sets that bound the possible future states of an NFL and can be checked against dangerous regions of the state space to verify that the system does not violate safety constraints. Since exact reachable sets are generally intractable to calculate, reachable set over approximations (RSOAs) are typically used. The problem with RSOAs is that they can be overly conservative, making it difficult to verify the satisfaction of safety constraints, especially over long time horizons or for highly nonlinear NN control policies. Refinement strategies such as partitioning or symbolic propagation are typically used to limit the conservativeness of RSOAs, but these approaches come with a high computational cost and often can only be used to verify safety for simple reachability problems. This paper presents Constraint-Aware Refinement for Verification (CARV): an efficient refinement strategy that reduces the conservativeness of RSOAs by explicitly using the safety constraints on the NFL to refine RSOAs only where necessary. We demonstrate that CARV can verify the safety of an NFL where other approaches either fail or take up to 60x longer and 40x the memory.

44.3ROMar 25
MIGHTY: Hermite Spline-based Efficient Trajectory Planning

Kota Kondo, Yuwei Wu, Vijay Kumar et al.

Hard-constraint trajectory planners often rely on commercial solvers and demand substantial computational resources. Existing soft-constraint methods achieve faster computation, but either (1) decouple spatial and temporal optimization or (2) restrict the search space. To overcome these limitations, we introduce MIGHTY, a Hermite spline-based planner that performs spatiotemporal optimization while fully leveraging the continuous search space of a spline. In simulation, MIGHTY achieves a 9.3% reduction in computation time and a 13.1% reduction in travel time over state-of-the-art baselines, with a 100% success rate. In hardware, MIGHTY completes multiple high-speed flights up to 6.7 m/s in a cluttered static environment and long-duration flights with dynamically added obstacles.

24.4ROMar 19
Uncertainty-Aware Multi-Robot Task Allocation With Strongly Coupled Inter-Robot Rewards

Ben Rossano, Jaein Lim, Jonathan P. How

Allocating tasks to heterogeneous robot teams in environments with uncertain task requirements is a fundamentally challenging problem. Redundantly assigning multiple robots to such tasks is overly conservative, while purely reactive strategies risk costly delays in task completion when the uncertain capabilities become necessary. This paper introduces an auction-based task allocation algorithm that explicitly models uncertain task requirements, leveraging a novel strongly coupled formulation to allocate tasks such that robots with potentially required capabilities are naturally positioned near uncertain tasks. This approach enables robots to remain productive on nearby tasks while simultaneously mitigating large delays in completion time when their capabilities are required. Through a set of simulated disaster relief missions with task deadline constraints, we demonstrate that the proposed approach yields up to a 15% increase in expected mission value compared to redundancy-based methods. Furthermore, we propose a novel framework to approximate uncertainty arising from unmodeled changes in task requirements by leveraging the natural delay between encountering unexpected environmental conditions and confirming whether additional capabilities are required to complete a task. We show that our approach achieves up to an 18% increase in expected mission value using this framework compared to reactive methods that don't leverage this delay.

CVJun 20, 2025Code
LunarLoc: Segment-Based Global Localization on the Moon

Annika Thomas, Robaire Galliath, Aleksander Garbuz et al.

Global localization is necessary for autonomous operations on the lunar surface where traditional Earth-based navigation infrastructure, such as GPS, is unavailable. As NASA advances toward sustained lunar presence under the Artemis program, autonomous operations will be an essential component of tasks such as robotic exploration and infrastructure deployment. Tasks such as excavation and transport of regolith require precise pose estimation, but proposed approaches such as visual-inertial odometry (VIO) accumulate odometry drift over long traverses. Precise pose estimation is particularly important for upcoming missions such as the ISRU Pilot Excavator (IPEx) that rely on autonomous agents to operate over extended timescales and varied terrain. To help overcome odometry drift over long traverses, we propose LunarLoc, an approach to global localization that leverages instance segmentation for zero-shot extraction of boulder landmarks from onboard stereo imagery. Segment detections are used to construct a graph-based representation of the terrain, which is then aligned with a reference map of the environment captured during a previous session using graph-theoretic data association. This method enables accurate and drift-free global localization in visually ambiguous settings. LunarLoc achieves sub-cm level accuracy in multi-session global localization experiments, significantly outperforming the state of the art in lunar global localization. To encourage the development of further methods for global localization on the Moon, we release our datasets publicly with a playback module: https://github.com/mit-acl/lunarloc-data.

RONov 20, 2020Code
CLIPPER: A Graph-Theoretic Framework for Robust Data Association

Parker C. Lusk, Kaveh Fathian, Jonathan P. How

We present CLIPPER (Consistent LInking, Pruning, and Pairwise Error Rectification), a framework for robust data association in the presence of noise and outliers. We formulate the problem in a graph-theoretic framework using the notion of geometric consistency. State-of-the-art techniques that use this framework utilize either combinatorial optimization techniques that do not scale well to large-sized problems, or use heuristic approximations that yield low accuracy in high-noise, high-outlier regimes. In contrast, CLIPPER uses a relaxation of the combinatorial problem and returns solutions that are guaranteed to correspond to the optima of the original problem. Low time complexity is achieved with an efficient projected gradient ascent approach. Experiments indicate that CLIPPER maintains a consistently low runtime of 15 ms where exact methods can require up to 24 s at their peak, even on small-sized problems with 200 associations. When evaluated on noisy point cloud registration problems, CLIPPER achieves 100% precision and 98% recall in 90% outlier regimes while competing algorithms begin degrading by 70% outliers. In an instance of associating noisy points of the Stanford Bunny with 990 outlier associations and only 10 inlier associations, CLIPPER successfully returns 8 inlier associations with 100% precision in 138 ms. Code is available at https://mit-acl.github.io/clipper.

AINov 28, 2017Code
Crossmodal Attentive Skill Learner

Shayegan Omidshafiei, Dong-Ki Kim, Jason Pazis et al.

This paper presents the Crossmodal Attentive Skill Learner (CASL), integrated with the recently-introduced Asynchronous Advantage Option-Critic (A2OC) architecture [Harb et al., 2017] to enable hierarchical reinforcement learning across multiple sensory inputs. We provide concrete examples where the approach not only improves performance in a single task, but accelerates transfer to new tasks. We demonstrate the attention mechanism anticipates and identifies useful latent features, while filtering irrelevant sensor modalities during execution. We modify the Arcade Learning Environment [Bellemare et al., 2013] to support audio queries, and conduct evaluations of crossmodal learning in the Atari 2600 game Amidar. Finally, building on the recent work of Babaeizadeh et al. [2017], we open-source a fast hybrid CPU-GPU implementation of CASL.

42.8ROMar 16
Robust Dynamic Object Detection in Cluttered Indoor Scenes via Learned Spatiotemporal Cues

Juan Rached, Yixuan Jia, Kota Kondo et al.

Reliable dynamic object detection in cluttered environments remains a critical challenge for autonomous navigation. Purely geometric LiDAR pipelines that rely on clustering and heuristic filtering can miss dynamic obstacles when they move in close proximity to static structure or are only partially observed. Vision-augmented approaches can provide additional semantic cues, but are often limited by closed-set detectors and camera field-of-view constraints, reducing robustness to novel obstacles and out-of-frustum events. In this work, we present a LiDAR-only framework that fuses temporal occupancy-grid-based motion segmentation with a learned bird's-eye-view (BEV) dynamic prior. A fusion module prioritizes 3D detections when available, while using the learned dynamic grid to recover detections that would otherwise be lost due to proximity-induced false negatives. Experiments with motion-capture ground truth show our method achieves 28.67% higher recall and 18.50% higher F1 score than the state-of-the-art in substantially cluttered environments while maintaining comparable precision and position error.

ROMay 2, 2024
CGD: Constraint-Guided Diffusion Policies for UAV Trajectory Planning

Kota Kondo, Andrea Tagliabue, Xiaoyi Cai et al.

Traditional optimization-based planners, while effective, suffer from high computational costs, resulting in slow trajectory generation. A successful strategy to reduce computation time involves using Imitation Learning (IL) to develop fast neural network (NN) policies from those planners, which are treated as expert demonstrators. Although the resulting NN policies are effective at quickly generating trajectories similar to those from the expert, (1) their output does not explicitly account for dynamic feasibility, and (2) the policies do not accommodate changes in the constraints different from those used during training. To overcome these limitations, we propose Constraint-Guided Diffusion (CGD), a novel IL-based approach to trajectory planning. CGD leverages a hybrid learning/online optimization scheme that combines diffusion policies with a surrogate efficient optimization problem, enabling the generation of collision-free, dynamically feasible trajectories. The key ideas of CGD include dividing the original challenging optimization problem solved by the expert into two more manageable sub-problems: (a) efficiently finding collision-free paths, and (b) determining a dynamically-feasible time-parametrization for those paths to obtain a trajectory. Compared to conventional neural network architectures, we demonstrate through numerical evaluations significant improvements in performance and dynamic feasibility under scenarios with new constraints never encountered during training.

ROJan 9, 2024
SOS-Match: Segmentation for Open-Set Robust Correspondence Search and Robot Localization in Unstructured Environments

Annika Thomas, Jouko Kinnari, Parker Lusk et al.

We present SOS-Match, a novel framework for detecting and matching objects in unstructured environments. Our system consists of 1) a front-end mapping pipeline using a zero-shot segmentation model to extract object masks from images and track them across frames and 2) a frame alignment pipeline that uses the geometric consistency of object relationships to efficiently localize across a variety of conditions. We evaluate SOS-Match on the Batvik seasonal dataset which includes drone flights collected over a coastal plot of southern Finland during different seasons and lighting conditions. Results show that our approach is more robust to changes in lighting and appearance than classical image feature-based approaches or global descriptor methods, and it provides more viewpoint invariance than learning-based feature detection and description approaches. SOS-Match localizes within a reference map up to 46x faster than other feature-based approaches and has a map size less than 0.5% the size of the most compact other maps. SOS-Match is a promising new approach for landmark detection and correspondence search in unstructured environments that is robust to changes in lighting and appearance and is more computationally efficient than other approaches, suggesting that the geometric arrangement of segments is a valuable localization cue in unstructured environments. We release our datasets at https://acl.mit.edu/SOS-Match/.

ROJun 9, 2025
Language-Grounded Hierarchical Planning and Execution with Multi-Robot 3D Scene Graphs

Jared Strader, Aaron Ray, Jacob Arkin et al.

In this paper, we introduce a multi-robot system that integrates mapping, localization, and task and motion planning (TAMP) enabled by 3D scene graphs to execute complex instructions expressed in natural language. Our system builds a shared 3D scene graph incorporating an open-set object-based map, which is leveraged for multi-robot 3D scene graph fusion. This representation supports real-time, view-invariant relocalization (via the object-based map) and planning (via the 3D scene graph), allowing a team of robots to reason about their surroundings and execute complex tasks. Additionally, we introduce a planning approach that translates operator intent into Planning Domain Definition Language (PDDL) goals using a Large Language Model (LLM) by leveraging context from the shared 3D scene graph and robot capabilities. We provide an experimental assessment of the performance of our system on real-world tasks in large-scale, outdoor environments. A supplementary video is available at https://youtu.be/8xbGGOLfLAY.

LGDec 5, 2024
GRAM: Generalization in Deep RL with a Robust Adaptation Module

James Queeney, Xiaoyi Cai, Alexander Schperberg et al.

The reliable deployment of deep reinforcement learning in real-world settings requires the ability to generalize across a variety of conditions, including both in-distribution scenarios seen during training as well as novel out-of-distribution scenarios. In this work, we present a framework for dynamics generalization in deep reinforcement learning that unifies these two distinct types of generalization within a single architecture. We introduce a robust adaptation module that provides a mechanism for identifying and reacting to both in-distribution and out-of-distribution environment dynamics, along with a joint training pipeline that combines the goals of in-distribution adaptation and out-of-distribution robustness. Our algorithm GRAM achieves strong generalization performance across in-distribution and out-of-distribution scenarios upon deployment, which we demonstrate through extensive simulation and hardware locomotion experiments on a quadruped robot.

ROJun 23, 2025
GRAND-SLAM: Local Optimization for Globally Consistent Large-Scale Multi-Agent Gaussian SLAM

Annika Thomas, Aneesa Sonawalla, Alex Rose et al.

3D Gaussian splatting has emerged as an expressive scene representation for RGB-D visual SLAM, but its application to large-scale, multi-agent outdoor environments remains unexplored. Multi-agent Gaussian SLAM is a promising approach to rapid exploration and reconstruction of environments, offering scalable environment representations, but existing approaches are limited to small-scale, indoor environments. To that end, we propose Gaussian Reconstruction via Multi-Agent Dense SLAM, or GRAND-SLAM, a collaborative Gaussian splatting SLAM method that integrates i) an implicit tracking module based on local optimization over submaps and ii) an approach to inter- and intra-robot loop closure integrated into a pose-graph optimization framework. Experiments show that GRAND-SLAM provides state-of-the-art tracking performance and 28% higher PSNR than existing methods on the Replica indoor dataset, as well as 91% lower multi-agent tracking error and improved rendering over existing multi-agent methods on the large-scale, outdoor Kimera-Multi dataset.

RODec 13, 2025
Navigation Around Unknown Space Objects Using Visible-Thermal Image Fusion

Eric J. Elias, Michael Esswein, Jonathan P. How et al.

As the popularity of on-orbit operations grows, so does the need for precise navigation around unknown resident space objects (RSOs) such as other spacecraft, orbital debris, and asteroids. The use of Simultaneous Localization and Mapping (SLAM) algorithms is often studied as a method to map out the surface of an RSO and find the inspector's relative pose using a lidar or conventional camera. However, conventional cameras struggle during eclipse or shadowed periods, and lidar, though robust to lighting conditions, tends to be heavier, bulkier, and more power-intensive. Thermal-infrared cameras can track the target RSO throughout difficult illumination conditions without these limitations. While useful, thermal-infrared imagery lacks the resolution and feature-richness of visible cameras. In this work, images of a target satellite in low Earth orbit are photo-realistically simulated in both visible and thermal-infrared bands. Pixel-level fusion methods are used to create visible/thermal-infrared composites that leverage the best aspects of each camera. Navigation errors from a monocular SLAM algorithm are compared between visible, thermal-infrared, and fused imagery in various lighting and trajectories. Fused imagery yields substantially improved navigation performance over visible-only and thermal-only methods.

ROAug 5, 2025
Aerobatic maneuvers in insect-scale flapping-wing aerial robots via deep-learned robust tube model predictive control

Yi-Hsuan Hsiao, Andrea Tagliabue, Owen Matteson et al.

Aerial insects exhibit highly agile maneuvers such as sharp braking, saccades, and body flips under disturbance. In contrast, insect-scale aerial robots are limited to tracking non-aggressive trajectories with small body acceleration. This performance gap is contributed by a combination of low robot inertia, fast dynamics, uncertainty in flapping-wing aerodynamics, and high susceptibility to environmental disturbance. Executing highly dynamic maneuvers requires the generation of aggressive flight trajectories that push against the hardware limit and a high-rate feedback controller that accounts for model and environmental uncertainty. Here, through designing a deep-learned robust tube model predictive controller, we showcase insect-like flight agility and robustness in a 750-millgram flapping-wing robot. Our model predictive controller can track aggressive flight trajectories under disturbance. To achieve a high feedback rate in a compute-constrained real-time system, we design imitation learning methods to train a two-layer, fully connected neural network, which resembles insect flight control architecture consisting of central nervous system and motor neurons. Our robot demonstrates insect-like saccade movements with lateral speed and acceleration of 197 centimeters per second and 11.7 meters per second square, representing 447$\%$ and 255$\%$ improvement over prior results. The robot can also perform saccade maneuvers under 160 centimeters per second wind disturbance and large command-to-force mapping errors. Furthermore, it performs 10 consecutive body flips in 11 seconds - the most challenging maneuver among sub-gram flyers. These results represent a milestone in achieving insect-scale flight agility and inspire future investigations on sensing and compute autonomy.

ROJun 14, 2024
PRIMER: Perception-Aware Robust Learning-based Multiagent Trajectory Planner

Kota Kondo, Claudius T. Tewari, Andrea Tagliabue et al.

In decentralized multiagent trajectory planners, agents need to communicate and exchange their positions to generate collision-free trajectories. However, due to localization errors/uncertainties, trajectory deconfliction can fail even if trajectories are perfectly shared between agents. To address this issue, we first present PARM and PARM*, perception-aware, decentralized, asynchronous multiagent trajectory planners that enable a team of agents to navigate uncertain environments while deconflicting trajectories and avoiding obstacles using perception information. PARM* differs from PARM as it is less conservative, using more computation to find closer-to-optimal solutions. While these methods achieve state-of-the-art performance, they suffer from high computational costs as they need to solve large optimization problems onboard, making it difficult for agents to replan at high rates. To overcome this challenge, we present our second key contribution, PRIMER, a learning-based planner trained with imitation learning (IL) using PARM* as the expert demonstrator. PRIMER leverages the low computational requirements at deployment of neural networks and achieves a computation speed up to 5500 times faster than optimization-based approaches.

LGMay 6, 2024
Out-of-Distribution Adaptation in Offline RL: Counterfactual Reasoning via Causal Normalizing Flows

Minjae Cho, Jonathan P. How, Chuangchuang Sun

Despite notable successes of Reinforcement Learning (RL), the prevalent use of an online learning paradigm prevents its widespread adoption, especially in hazardous or costly scenarios. Offline RL has emerged as an alternative solution, learning from pre-collected static datasets. However, this offline learning introduces a new challenge known as distributional shift, degrading the performance when the policy is evaluated on scenarios that are Out-Of-Distribution (OOD) from the training dataset. Most existing offline RL resolves this issue by regularizing policy learning within the information supported by the given dataset. However, such regularization overlooks the potential for high-reward regions that may exist beyond the dataset. This motivates exploring novel offline learning techniques that can make improvements beyond the data support without compromising policy performance, potentially by learning causation (cause-and-effect) instead of correlation from the dataset. In this paper, we propose the MOOD-CRL (Model-based Offline OOD-Adapting Causal RL) algorithm, which aims to address the challenge of extrapolation for offline policy training through causal inference instead of policy-regularizing methods. Specifically, Causal Normalizing Flow (CNF) is developed to learn the transition and reward functions for data generation and augmentation in offline policy evaluation and training. Based on the data-invariant, physics-based qualitative causal graph and the observational data, we develop a novel learning scheme for CNF to learn the quantitative structural causal model. As a result, CNF gains predictive and counterfactual reasoning capabilities for sequential decision-making tasks, revealing a high potential for OOD adaptation. Our CNF-based offline RL approach is validated through empirical evaluations, outperforming model-free and model-based methods by a significant margin.

RONov 29, 2021
MIXER: A Principled Framework for Multimodal, Multiway Data Association

Parker C. Lusk, Ronak Roy, Kaveh Fathian et al.

A fundamental problem in robotic perception is matching identical objects or data, with applications such as loop closure detection, place recognition, object tracking, and map fusion. While the problem becomes considerably more challenging when matching should be done jointly across multiple, multimodal sets of data, the robustness and accuracy of matching in the presence of noise and outliers can be greatly improved in this setting. At present, multimodal techniques do not leverage multiway information, and multiway techniques do not incorporate different modalities, leading to inferior results. In contrast, we present a principled mixed-integer quadratic framework to address this issue. We use a novel continuous relaxation in a projected gradient descent algorithm that guarantees feasible solutions of the integer program are obtained efficiently. We demonstrate experimentally that correspondences obtained from our approach are more stable to noise and errors than state-of-the-art techniques. Tested on a robotics dataset, our algorithm resulted in a 35% increase in F1 score when compared to the best alternative.

ROOct 2, 2021
Incremental Non-Gaussian Inference for SLAM Using Normalizing Flows

Qiangqiang Huang, Can Pu, Kasra Khosoussi et al.

This paper presents normalizing flows for incremental smoothing and mapping (NF-iSAM), a novel algorithm for inferring the full posterior distribution in SLAM problems with nonlinear measurement models and non-Gaussian factors. NF-iSAM exploits the expressive power of neural networks, and trains normalizing flows to model and sample the full posterior. By leveraging the Bayes tree, NF-iSAM enables efficient incremental updates similar to iSAM2, albeit in the more challenging non-Gaussian setting. We demonstrate the advantages of NF-iSAM over state-of-the-art point and distribution estimation algorithms using range-only SLAM problems with data association ambiguity. NF-iSAM presents superior accuracy in describing the posterior beliefs of continuous variables (e.g., position) and discrete variables (e.g., data association).