Rahul Mangharam

RO
h-index: 9
30 papers
705 citations
Novelty: 50%
AI Score: 56

30 Papers

SY · Jan 20, 2016 · Code
Data-Driven Modeling, Control and Tools for Cyber-Physical Energy Systems

Madhur Behl, Achin Jain, Rahul Mangharam

Demand response (DR) is becoming increasingly important as volatility on the grid continues to increase. Current DR approaches are either completely manual and rule-based or rely on first-principles models that are prohibitively costly and time-consuming to build. We consider the problem of data-driven end-user DR for large buildings, which involves predicting the demand response baseline, evaluating fixed rule-based DR strategies, and synthesizing DR control actions. We provide a model-based control with regression trees algorithm (mbCRT), which allows us to perform closed-loop control for DR strategy synthesis for large commercial buildings. Our data-driven control synthesis algorithm outperforms rule-based DR by $17\%$ for a large DoE commercial reference building and leads to a curtailment of $380$ kW and over $\$45,000$ in savings. Our methods have been integrated into an open source tool called DR-Advisor, which acts as a recommender system for the building's facilities manager and provides suitable control actions to meet the desired load curtailment while maintaining operations and maximizing the economic reward. DR-Advisor achieves $92.8\%$ to $98.9\%$ prediction accuracy for 8 buildings on Penn's campus. We compare DR-Advisor with other data-driven methods and rank $2^{nd}$ on ASHRAE's benchmarking data-set for energy prediction.

77.8 · SI · May 26
Multiagent Social Influence: Modeling Persuasion in Contested Social Networks

Renukanandan Tumu, Cristian Ioan Vasile, Victor Preciado et al.

We present the Social Influence Game (SIG), a framework for modeling adversarial persuasion in social networks with an arbitrary number of competing players. Our goal is to provide a tractable and interpretable model of contested influence that scales to large systems while capturing the structural leverage points of networks. Each player allocates influence from a fixed budget to steer opinions that evolve under DeGroot dynamics, and we prove that the resulting optimization problem is a difference-of-convex program. To enable scalability, we develop an Iterated Linear (IL) solver that approximates player objectives with linear programs. In experiments on random and archetypical networks, IL achieves solutions within 7% of nonlinear solvers while being over 10x faster, scaling to large social networks. This paper lays a foundation for asymptotic analysis of contested influence in complex networks.
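The DeGroot dynamics named in the abstract above can be illustrated with a minimal sketch. It shows only the basic opinion update, in which each agent replaces its opinion with a weighted average of its neighbors' opinions. It does not model the players, budgets, or the Iterated Linear solver from the paper. The 3-agent weight matrix `W` and the initial opinions `x0` are hypothetical values chosen for illustration.

```python
def degroot_step(weights, opinions):
    """One DeGroot update: agent i takes the weighted average of all opinions,
    using row i of the (row-stochastic) weight matrix."""
    return [sum(w * x for w, x in zip(row, opinions)) for row in weights]


def degroot_run(weights, opinions, steps):
    """Apply the DeGroot update repeatedly and return the final opinions."""
    for _ in range(steps):
        opinions = degroot_step(weights, opinions)
    return opinions


# Hypothetical 3-agent line network; every row sums to 1.
W = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
x0 = [1.0, 0.0, 0.0]

x1 = degroot_step(W, x0)          # opinions after a single update
x_final = degroot_run(W, x0, 200)  # opinions after many updates
```

For this connected network the opinions converge to a consensus value, a weighted average of the initial opinions in which better-connected agents count more.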

SY · Apr 1, 2023
Safe Perception-Based Control under Stochastic Sensor Uncertainty using Conformal Prediction

Shuo Yang, George J. Pappas, Rahul Mangharam et al.

We consider perception-based control using state estimates that are obtained from high-dimensional sensor measurements via learning-enabled perception maps. However, these perception maps are not perfect and result in state estimation errors that can lead to unsafe system behavior. Stochastic sensor noise can make matters worse and result in estimation errors that follow unknown distributions. We propose a perception-based control framework that i) quantifies estimation uncertainty of perception maps, and ii) integrates these uncertainty representations into the control design. To do so, we use conformal prediction to compute valid state estimation regions, which are sets that contain the unknown state with high probability. We then devise a sampled-data controller for continuous-time systems based on the notion of measurement robust control barrier functions. Our controller uses ideas from self-triggered control and enables us to avoid using stochastic calculus. Our framework is agnostic to the choice of the perception map, independent of the noise distribution, and to the best of our knowledge the first to provide probabilistic safety guarantees in such a setting. We demonstrate the effectiveness of our proposed perception-based controller for a LiDAR-enabled F1/10th car.
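As a rough illustration of how conformal prediction can yield a state estimation region, the sketch below uses the standard split conformal recipe. This is an assumption for illustration, not necessarily the exact construction in the paper. Given estimation errors measured on held-out calibration data, it returns a radius such that a ball of that radius around a new estimate contains the true state with probability at least `1 - alpha`, assuming the calibration and test errors are exchangeable. The function name and the error values are hypothetical.

```python
import math


def conformal_radius(calibration_errors, alpha):
    """Split conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest
    calibration error. Returns infinity when there are too few samples
    to certify the requested coverage level."""
    n = len(calibration_errors)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(calibration_errors)[k - 1]


# Hypothetical estimation errors ||x_hat - x|| on 9 calibration samples.
errors = [0.1, 0.4, 0.2, 0.3, 0.5, 0.25, 0.15, 0.35, 0.45]
radius = conformal_radius(errors, alpha=0.25)
```

The `(n + 1)` correction, rather than a plain empirical quantile, is what makes the coverage guarantee hold for finite calibration sets.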

RO · Feb 2, 2023
Physics Constrained Motion Prediction with Uncertainty Quantification

Renukanandan Tumu, Lars Lindemann, Truong Nghiem et al.

Predicting the motion of dynamic agents is a critical task for guaranteeing the safety of autonomous systems. A particular challenge is that motion prediction algorithms should obey dynamics constraints and quantify prediction uncertainty as a measure of confidence. We present a physics-constrained approach for motion prediction which uses a surrogate dynamical model to ensure that predicted trajectories are dynamically feasible. We propose a two-step integration consisting of intent and trajectory prediction subject to dynamics constraints. We also construct prediction regions that quantify uncertainty and are tailored for autonomous driving by using conformal prediction, a popular statistical tool. Physics Constrained Motion Prediction achieves a 41% better ADE, 56% better FDE, and 19% better IoU over a baseline in experiments using an autonomous racing dataset.

LG · Mar 1, 2023
MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts

Xiatao Sun, Shuo Yang, Mingyan Zhou et al.

Imitation learning has been widely applied to various autonomous systems thanks to recent developments in interactive algorithms that address covariate shift and compounding errors induced by traditional approaches like behavior cloning. However, existing interactive imitation learning methods assume access to one perfect expert, whereas in reality it is more likely that multiple imperfect experts are available. In this paper, we propose MEGA-DAgger, a new DAgger variant that is suitable for interactive learning with multiple imperfect experts. First, unsafe demonstrations are filtered while aggregating the training data, so the imperfect demonstrations have little influence when training the novice policy. Next, experts are evaluated and compared on scenario-specific metrics to resolve conflicting labels among experts. Through experiments in autonomous racing scenarios, we demonstrate that policies learned using MEGA-DAgger can outperform both the experts and policies learned using state-of-the-art interactive imitation learning algorithms such as Human-Gated DAgger. The supplementary video can be found at https://youtu.be/wPCht31MHrw.

SY · Sep 20, 2022
Differentiable Safe Controller Design through Control Barrier Functions

Shuo Yang, Shaoru Chen, Victor M. Preciado et al.

Learning-based controllers, such as neural network (NN) controllers, can show high empirical performance but lack formal safety guarantees. To address this issue, control barrier functions (CBFs) have been applied as a safety filter to monitor and modify the outputs of learning-based controllers in order to guarantee the safety of the closed-loop system. However, such modification can be myopic with unpredictable long-term effects. In this work, we propose a safe-by-construction NN controller which employs differentiable CBF-based safety layers, and investigate the performance of safe-by-construction NN controllers in learning-based control. Specifically, two formulations of controllers are compared: one is projection-based and the other relies on our proposed set-theoretic parameterization. Both methods demonstrate improved closed-loop performance over using CBF as a separate safety filter in numerical experiments.
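To make the "CBF as a separate safety filter" baseline from the abstract above concrete, here is a minimal sketch for a toy scalar system. Everything in it is an assumption chosen for illustration: the dynamics are a single integrator `dx/dt = u`, the safe set is `x <= x_max` with barrier `h(x) = x_max - x`, and `gamma` is a hypothetical class-K gain. The CBF condition `dh/dt >= -gamma * h` then reduces to `u <= gamma * (x_max - x)`, so the usual quadratic program (stay as close as possible to the nominal input subject to the CBF condition) has a closed-form solution. The differentiable safety layers proposed in the paper are not shown.

```python
def cbf_filter(x, u_nom, x_max=1.0, gamma=2.0):
    """Closed-form solution of: min (u - u_nom)^2  s.t.  u <= gamma * (x_max - x).
    The nominal input passes through unchanged whenever it already
    satisfies the CBF condition."""
    u_bound = gamma * (x_max - x)
    return min(u_nom, u_bound)


def simulate(x0=0.0, u_nom=1.0, dt=0.01, steps=1000):
    """Forward-Euler rollout of dx/dt = u with the filtered input."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        x += dt * cbf_filter(x, u_nom)
        trajectory.append(x)
    return trajectory


# The nominal controller always pushes toward the boundary;
# the filter slows the state down so it never crosses x_max.
trajectory = simulate()
```

The filter only intervenes near the boundary, which is the myopic behavior the abstract refers to: it corrects the current input without accounting for long-term effects.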

AI · Jun 11, 2023
Multi-Agent Reinforcement Learning Guided by Signal Temporal Logic Specifications

Jiangwei Wang, Shuo Yang, Ziyan An et al.

Reward design is a key component of deep reinforcement learning, yet some tasks and designer's objectives may be unnatural to define as a scalar cost function. Among the various techniques, formal methods integrated with DRL have garnered considerable attention due to their expressiveness and flexibility to define the reward and requirements for different states and actions of the agent. However, how to leverage Signal Temporal Logic (STL) to guide multi-agent reinforcement learning reward design remains unexplored. Complex interactions, heterogeneous goals and critical safety requirements in multi-agent systems make this problem even more challenging. In this paper, we propose a novel STL-guided multi-agent reinforcement learning framework. The STL requirements are designed to include both task specifications according to the objective of each agent and safety specifications, and the robustness values of the STL specifications are leveraged to generate rewards. We validate the advantages of our method through empirical studies. The experimental results demonstrate significant reward performance improvements compared to MARL without STL guidance, along with a remarkable increase in the overall safety rate of the multi-agent systems.
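The idea of turning STL robustness values into rewards can be sketched with the standard quantitative semantics for two simple operators. The specification, signals, and thresholds below are hypothetical, and the paper's actual reward construction may differ. The example combines a safety requirement ("always keep more than `d_min` from the other agent") with a task requirement ("eventually get within `eps` of the goal").

```python
def rob_always_greater(signal, threshold):
    """Robustness of G (signal > threshold): the worst-case margin over time."""
    return min(v - threshold for v in signal)


def rob_eventually_less(signal, threshold):
    """Robustness of F (signal < threshold): the best-case margin over time."""
    return max(threshold - v for v in signal)


def rob_and(*values):
    """Robustness of a conjunction: the minimum of the operand robustness values."""
    return min(values)


# Hypothetical episode: distance to the other agent and distance to the goal.
dist_to_agent = [2.0, 1.5, 1.0, 1.5]
dist_to_goal = [3.0, 2.0, 1.0, 0.25]

safety = rob_always_greater(dist_to_agent, threshold=0.5)
task = rob_eventually_less(dist_to_goal, threshold=0.5)
reward = rob_and(safety, task)
```

A positive value means the specification is satisfied, and its magnitude indicates by how much, which is what makes it usable as a reward signal.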

RO · Sep 19, 2023
Learning Adaptive Safety for Multi-Agent Systems

Luigi Berducci, Shuo Yang, Rahul Mangharam et al.

Ensuring safety in dynamic multi-agent systems is challenging due to limited information about the other agents. Control Barrier Functions (CBFs) are showing promise for safety assurance but current methods make strong assumptions about other agents and often rely on manual tuning to balance safety, feasibility, and performance. In this work, we delve into the problem of adaptive safe learning for multi-agent systems with CBF. We show how emergent behavior can be profoundly influenced by the CBF configuration, highlighting the necessity for a responsive and dynamic approach to CBF design. We present ASRL, a novel adaptive safe RL framework, to fully automate the optimization of policy and CBF coefficients, to enhance safety and long-term performance through reinforcement learning. By directly interacting with the other agents, ASRL learns to cope with diverse agent behaviours and maintains the cost violations below a desired limit. We evaluate ASRL in a multi-robot system and a competitive multi-agent racing scenario, against learning-based and control-theoretic approaches. We empirically demonstrate the efficacy and flexibility of ASRL, and assess generalization and scalability to out-of-distribution scenarios. Code and supplementary material are public online.

RO · Sep 16, 2022
Game-theoretic Objective Space Planning

Hongrui Zheng, Zhijun Zhuang, Johannes Betz et al.

Generating competitive strategies and performing continuous motion planning simultaneously in an adversarial setting is a challenging problem. In addition, understanding the intent of other agents is crucial to deploying autonomous systems in adversarial multi-agent environments. Existing approaches either discretize agent action by grouping similar control inputs, sacrificing performance in motion planning, or plan in uninterpretable latent spaces, producing hard-to-understand agent behaviors. Furthermore, the most popular policy optimization frameworks do not recognize the long-term effect of actions and become myopic. This paper proposes an agent action discretization method via abstraction that provides clear intentions of agent actions, an efficient offline pipeline of agent population synthesis, and a planning strategy using counterfactual regret minimization with function approximation. Finally, we experimentally validate our findings on scaled autonomous vehicles in a head-to-head racing setting. We demonstrate that using the proposed framework significantly improves learning, improves the win rate against different opponents, and the improvements can be transferred to unseen opponents in an unseen environment.

SY · Dec 26, 2015
Model Checking Implantable Cardioverter Defibrillators

Houssam Abbas, Kuk Jin Jang, Zhihao Jiang et al.

Ventricular Fibrillation is a disorganized electrical excitation of the heart that results in inadequate blood flow to the body. It usually ends in death within seconds. The most common way to treat the symptoms of fibrillation is to implant a medical device, known as an Implantable Cardioverter Defibrillator (ICD), in the patient's body. Model-based verification can supply rigorous proofs of safety and efficacy. In this paper, we build a hybrid system model of the human heart+ICD closed loop, and show it to be a STORMED system, a class of o-minimal hybrid systems that admit finite bisimulations. In general, it may not be possible to compute the bisimulation. We show that approximate reachability can yield a finite simulation for STORMED systems, which improves on the existing verification procedure. In the process, we show that certain compositions respect the STORMED property. Thus it is possible to model check important formal properties of ICDs in a closed loop with the heart, such as delayed therapy, missed therapy, or inappropriately administered therapy. The results of this paper are theoretical and motivate the creation of concrete model checking procedures for STORMED systems.

63.5 · SY · Apr 8
Failure-Aware Iterative Learning of State-Control Invariant Sets

Ahmad Amine, Nick-Marios T. Kokolakis, Ugo Rosolia et al.

In this paper, we address the problem of computing maximal state-control invariant sets using failing trajectories. We introduce the concept of state-control invariance, which extends control invariance from the state space to the joint state-control space. The maximal state-control invariant (MSCI) set simultaneously encodes the maximal control invariant set (MCI) and, for each state in the MCI, the set of control inputs that preserve invariance. We prove that the state projection of the MSCI is the MCI and the state-dependent sections of the MSCI are the admissible invariance-preserving inputs. Building on this framework, we develop a Failure-Aware Iterative Learning (FAIL) algorithm for deterministic linear time invariant systems with polytopic constraints. The algorithm iteratively updates a constraint set in the state-control space by learning predecessor halfspaces from one-step failing state-input pairs, without knowing the dynamics. For each failure, FAIL learns the violated halfspaces of the predecessor of the constraint set by a regression on failing trajectories. We prove that the learned constraint set converges monotonically to the MSCI. Numerical experiments on a double integrator system validate the proposed approach.

RO · Feb 18
SIT-LMPC: Safe Information-Theoretic Learning Model Predictive Control for Iterative Tasks

Zirui Zang, Ahmad Amine, Nick-Marios T. Kokolakis et al.

Robots executing iterative tasks in complex, uncertain environments require control strategies that balance robustness, safety, and high performance. This paper introduces a safe information-theoretic learning model predictive control (SIT-LMPC) algorithm for iterative tasks. Specifically, we design an iterative control framework based on an information-theoretic model predictive control algorithm to address a constrained infinite-horizon optimal control problem for discrete-time nonlinear stochastic systems. An adaptive penalty method is developed to ensure safety while balancing optimality. Trajectories from previous iterations are utilized to learn a value function using normalizing flows, which enables richer uncertainty modeling compared to Gaussian priors. SIT-LMPC is designed for highly parallel execution on graphics processing units, allowing efficient real-time optimization. Benchmark simulations and hardware experiments demonstrate that SIT-LMPC iteratively improves system performance while robustly satisfying system constraints.

14.9 · RO · Mar 24
Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms: Challenges and a Roadmap

Jianye Xu, Johannes Betz, Armin Mokhtarian et al.

This article proposes a roadmap to address the current challenges in small-scale testbeds for Connected and Automated Vehicles (CAVs) and robot swarms. The roadmap is a joint effort of participants in the workshop "1st Workshop on Small-Scale Testbeds for Connected and Automated Vehicles and Robot Swarms," held on June 2 at the IEEE Intelligent Vehicles Symposium (IV) 2024 in Jeju, South Korea. The roadmap contains three parts: 1) enhancing accessibility and diversity, especially for underrepresented communities, 2) sharing best practices for the development and maintenance of testbeds, and 3) connecting testbeds through an abstraction layer to support collaboration. The workshop features eight invited speakers, four contributed papers [1]-[4], and a presentation of a survey paper on testbeds [5]. The survey paper provides an online comparative table of more than 25 testbeds, available at https://bassamlab.github.io/testbeds-survey. The workshop's own website is available at https://cpm-remote.lrt.unibw-muenchen.de/iv24-workshop.

19.8 · SY · May 13
A Hybrid Learning-to-Optimize Framework for Mixed-Integer Quadratic Programming

Viet-Anh Le, Mu Xie, Rahul Mangharam

In this paper, we propose a learning-to-optimize (L2O) framework to accelerate solving parametric mixed-integer quadratic programming (MIQP) problems, with a particular focus on mixed-integer model predictive control (MI-MPC) applications. The framework learns to predict integer solutions with enhanced optimality and feasibility by integrating supervised learning (for optimality), self-supervised learning (for feasibility), and a differentiable quadratic programming (QP) layer, resulting in a hybrid L2O framework. Specifically, a neural network (NN) is used to learn the mapping from problem parameters to optimal integer solutions, while a differentiable QP layer is integrated to compute the corresponding continuous variables given the predicted integers and problem parameters. Moreover, a hybrid loss function is proposed, which combines a supervised loss with respect to the global optimal solution, and a self-supervised loss derived from the problem's objective and constraints. The effectiveness of the proposed framework is demonstrated on two benchmark MI-MPC problems, with comparative results against purely supervised and self-supervised learning models.

RO · Mar 6
STL-SVPIO: Signal Temporal Logic guided Stein Variational Path Integral Optimization

Hongrui Zheng, Zirui Zang, Ahmad Amine et al.

Signal Temporal Logic (STL) enables formal specification of complex spatiotemporal constraints for robotic task planning. However, synthesizing long-horizon continuous control trajectories from complex STL specifications is fundamentally challenging due to the nested structure of STL robustness objectives. Existing solver-based methods, such as Mixed-Integer Linear Programming (MILP), suffer from exponential scaling, whereas sampling methods, such as Model-Predictive Path Integral control (MPPI), struggle with sparse, long-horizon costs. We introduce Signal Temporal Logic guided Stein Variational Path Integral Optimization (STL-SVPIO), which reframes STL as a globally informative, differentiable reward-shaping mechanism. By leveraging Stein Variational Gradient Descent and differentiable physics engines, STL-SVPIO transports a mutually repulsive swarm of control particles toward high robustness regions. Our method transforms sparse logical satisfaction into tractable variational inference, mitigating the severe local minima traps of standard gradient-based methods. We demonstrate that STL-SVPIO significantly outperforms existing methods in both robustness and efficiency for traditional STL tasks. Moreover, it solves complex long-horizon tasks, including multi-agent coordination with synchronization and queuing while baselines either fail to discover feasible solutions, or become computationally intractable. Finally, we use STL-SVPIO in agile robotic motion planning tasks with nonlinear dynamics, such as 7-DoF manipulation and half cheetah back flips to show the generalizability of our algorithm.

41.7 · SY · Apr 2
Toward Single-Step MPPI via Differentiable Predictive Control

Viet-Anh Le, Renukanandan Tumu, Rahul Mangharam

Model predictive path integral (MPPI) is a sampling-based method for solving complex model predictive control (MPC) problems, but its real-time implementation faces two key challenges: the computational cost and sample requirements grow with the prediction horizon, and manually tuning the sampling covariance requires balancing exploration and noise. To address these issues, we propose Step-MPPI, a framework that learns a sampling distribution for efficient single-step lookahead MPPI implementation. Specifically, we use a neural network to parameterize the MPPI proposal distribution at each time step, and train it in a self-supervised manner over a long horizon using the MPC cost, constraint penalties, and a maximum-entropy regularization term. By embedding long-horizon objectives into training the neural distribution policy, Step-MPPI achieves the foresight of a multi-step optimizer with the millisecond-level latency of single-step lookahead. We demonstrate the efficiency of Step-MPPI across multiple challenging tasks in which MPPI suffers from high dimensionality and/or long control horizons.
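For context, the core update of standard MPPI (not the learned Step-MPPI proposal distribution) can be sketched as follows. Sampled control perturbations are weighted by the exponentiated negative cost of their rollouts and averaged into the nominal control sequence. The function names, the temperature parameter, and the toy numbers are assumptions for illustration. Rollout costs are taken as given, so no dynamics model appears here.

```python
import math


def mppi_weights(costs, temperature):
    """Softmin weights over rollout costs. Subtracting the minimum cost
    keeps the exponentials numerically stable without changing the result."""
    c_min = min(costs)
    unnormalized = [math.exp(-(c - c_min) / temperature) for c in costs]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def mppi_update(u_nominal, perturbations, costs, temperature):
    """Add the cost-weighted average perturbation to the nominal controls.
    perturbations[i][t] is the noise applied by sample i at time step t."""
    weights = mppi_weights(costs, temperature)
    return [
        u_nominal[t] + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t in range(len(u_nominal))
    ]


# Hypothetical two-sample, two-step example with scalar controls.
u_nominal = [0.0, 0.0]
perturbations = [[1.0, -1.0], [-1.0, 1.0]]

u_tie = mppi_update(u_nominal, perturbations, costs=[1.0, 1.0], temperature=1.0)
u_best = mppi_update(u_nominal, perturbations, costs=[0.0, 100.0], temperature=1.0)
```

Because every sample requires a full rollout over the horizon, cost grows with the horizon length, which is the overhead the single-step approach in the abstract aims to avoid.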

LG · Feb 2
AdaptNC: Adaptive Nonconformity Scores for Uncertainty-Aware Autonomous Systems in Dynamic Environments

Renukanandan Tumu, Aditya Singh, Rahul Mangharam

Rigorous uncertainty quantification is essential for the safe deployment of autonomous systems in unconstrained environments. Conformal Prediction (CP) provides a distribution-free framework for this task, yet its standard formulations rely on exchangeability assumptions that are violated by the distribution shifts inherent in real-world robotics. Existing online CP methods maintain target coverage by adaptively scaling the conformal threshold, but typically employ a static nonconformity score function. We show that this fixed geometry leads to highly conservative, volume-inefficient prediction regions when environments undergo structural shifts. To address this, we propose AdaptNC, a framework for the joint online adaptation of both the nonconformity score parameters and the conformal threshold. AdaptNC leverages an adaptive reweighting scheme to optimize score functions, and introduces a replay buffer mechanism to mitigate the coverage instability that occurs during score transitions. We evaluate AdaptNC on diverse robotic benchmarks involving multi-agent policy changes, environmental changes and sensor degradation. Our results demonstrate that AdaptNC significantly reduces prediction region volume compared to state-of-the-art threshold-only baselines while maintaining target coverage levels.

RO · Jan 24, 2019 · Code
F1/10: An Open-Source Autonomous Cyber-Physical Platform

Matthew O'Kelly, Varundev Sukhil, Houssam Abbas et al.

In 2005 DARPA labeled the realization of viable autonomous vehicles (AVs) a grand challenge; a short time later the idea became a moonshot that could change the automotive industry. Today, the question of safety stands between reality and solved. Given the right platform the CPS community is poised to offer unique insights. However, testing the limits of safety and performance on real vehicles is costly and hazardous. The use of such vehicles is also outside the reach of most researchers and students. In this paper, we present F1/10: an open-source, affordable, and high-performance 1/10 scale autonomous vehicle testbed. The F1/10 testbed carries a full suite of sensors, perception, planning, control, and networking software stacks that are similar to full scale solutions. We demonstrate key examples of the research enabled by the F1/10 testbed, and how the platform can be used to augment research and education in autonomous systems, making autonomy more accessible.

MA · Mar 25, 2024
Conformal Off-Policy Prediction for Multi-Agent Systems

Tom Kuipers, Renukanandan Tumu, Shuo Yang et al.

Off-Policy Prediction (OPP), i.e., predicting the outcomes of a target policy using only data collected under a nominal (behavioural) policy, is a paramount problem in data-driven analysis of safety-critical systems where the deployment of a new policy may be unsafe. To achieve dependable off-policy predictions, recent work on Conformal Off-Policy Prediction (COPP) leverages the conformal prediction framework to derive prediction regions with probabilistic guarantees under the target process. Existing COPP methods can account for the distribution shifts induced by policy switching, but are limited to single-agent systems and scalar outcomes (e.g., rewards). In this work, we introduce MA-COPP, the first conformal prediction method to solve OPP problems involving multi-agent systems, deriving joint prediction regions for all agents' trajectories when one or more ego agents change their policies. Unlike the single-agent scenario, this setting introduces higher complexity as the distribution shifts affect predictions for all agents, not just the ego agents, and the prediction task involves full multi-dimensional trajectories, not just reward values. A key contribution of MA-COPP is to avoid enumeration or exhaustive search of the output space of agent trajectories, which is instead required by existing COPP methods to construct the prediction region. We achieve this by showing that an over-approximation of the true joint prediction region (JPR) can be constructed, without enumeration, from the maximum density ratio of the JPR trajectories. We evaluate the effectiveness of MA-COPP in multi-agent systems from the PettingZoo library and the F1TENTH autonomous racing environment, achieving nominal coverage in higher dimensions and various shift settings.

LG · Dec 12, 2023
Multi-Modal Conformal Prediction Regions with Simple Structures by Optimizing Convex Shape Templates

Renukanandan Tumu, Matthew Cleaveland, Rahul Mangharam et al.

Conformal prediction is a statistical tool for producing prediction regions for machine learning models that are valid with high probability. A key component of conformal prediction algorithms is a non-conformity score function that quantifies how different a model's prediction is from the unknown ground truth value. Essentially, these functions determine the shape and the size of the conformal prediction regions. While prior work has gone into creating score functions that produce multi-modal prediction regions, such regions are generally too complex for use in downstream planning and control problems. We propose a method that optimizes parameterized shape template functions over calibration data, which results in non-conformity score functions that produce prediction regions with minimum volume. Our approach results in prediction regions that are multi-modal, so they can properly capture residuals of distributions that have multiple modes, and practical, so each region is convex and can be easily incorporated into downstream tasks, such as a motion planner using conformal prediction regions. Our method applies to general supervised learning tasks, while we illustrate its use in time-series prediction. We provide a toolbox and present illustrative case studies of F16 fighter jets and autonomous vehicles, showing an up to $68\%$ reduction in prediction region area compared to a circular baseline region.

RO · Apr 20, 2024
PoseINN: Realtime Visual-based Pose Regression and Localization with Invertible Neural Networks

Zirui Zang, Ahmad Amine, Rahul Mangharam

Estimating ego-pose from cameras is an important problem in robotics with applications ranging from mobile robotics to augmented reality. While SOTA models are becoming increasingly accurate, they can still be unwieldy due to high computational costs. In this paper, we propose to solve the problem by using invertible neural networks (INN) to find the mapping between the latent space of images and poses for a given scene. Our model achieves similar performance to the SOTA while being faster to train and only requiring offline rendering of low-resolution synthetic data. By using normalizing flows, the proposed method also provides uncertainty estimation for the output. We also demonstrate the efficiency of this method by deploying the model on a mobile robot.

RO · Jan 26, 2024
Learning Local Control Barrier Functions for Hybrid Systems

Shuo Yang, Yu Chen, Xiang Yin et al.

Hybrid dynamical systems are ubiquitous as practical robotic applications often involve both continuous states and discrete switchings. Safety is a primary concern for hybrid robotic systems. Existing safety-critical control approaches for hybrid systems are either computationally inefficient, detrimental to system performance, or limited to small-scale systems. To amend these drawbacks, in this paper, we propose a learning-enabled approach to construct local Control Barrier Functions (CBFs) to guarantee the safety of a wide class of nonlinear hybrid dynamical systems. The end result is a safe neural CBF-based switching controller. Our approach is computationally efficient, minimally invasive to any reference controller, and applicable to large-scale systems. We empirically evaluate our framework and demonstrate its efficacy and flexibility through two robotic examples including a high-dimensional autonomous racing case, against other CBF-based approaches and model predictive control.

RO · Feb 28, 2022
Gradient-free Multi-domain Optimization for Autonomous Systems

Hongrui Zheng, Johannes Betz, Rahul Mangharam

Autonomous systems are composed of several subsystems such as mechanical, propulsion, perception, planning and control. These are traditionally designed separately which makes performance optimization of the integrated system a significant challenge. In this paper, we study the problem of using gradient-free optimization methods to jointly optimize the multiple domains of an autonomous system to find the set of optimal architectures for both hardware and software. We specifically perform multi-domain, multi-parameter optimization on an autonomous vehicle to find the best decision-making process, motion planning and control algorithms, and the physical parameters for autonomous racing. We detail the multi-domain optimization scheme, benchmark with different core components, and provide insights for generalization to new autonomous systems. In addition, this paper provides a benchmark of the performances of six different gradient-free optimizers in three different operating environments. Our approach is validated with a case study where we describe the autonomous vehicle system architecture, optimization methods, and finally, provide an argument on gradient-free optimization being a powerful choice to improve the performance of autonomous systems in an integrated manner.

RO · Feb 14, 2022
Autonomous Vehicles on the Edge: A Survey on Autonomous Vehicle Racing

Johannes Betz, Hongrui Zheng, Alexander Liniger et al.

The rising popularity of self-driving cars has led to the emergence of a new research field in recent years: autonomous racing. Researchers are developing software and hardware for high-performance race vehicles that aim to operate autonomously at the edge of the vehicles' limits: high speeds, high accelerations, low reaction times, and highly uncertain, dynamic, and adversarial environments. This paper represents the first holistic survey that covers the research in the field of autonomous racing. We focus on the field of autonomous racecars only and present the algorithms, methods, and approaches that are used in the fields of perception, planning, and control as well as end-to-end learning. Further, with an increasing number of autonomous racing competitions, researchers now have access to a range of high-performance platforms to test and evaluate their autonomy algorithms. This survey presents a comprehensive overview of the current autonomous racing platforms, emphasizing the software-hardware co-evolution that led to the current stage. Finally, based on additional discussion with leading researchers in the field, we conclude with a summary of open research challenges that will guide future researchers in this field.

ROOct 3, 2021
Stress Testing Autonomous Racing Overtake Maneuvers with RRT

Stanley Bak, Johannes Betz, Abhinav Chawla et al.

High-performance autonomy often must operate at the boundaries of safety. When external agents are present in a system, the process of ensuring safety without sacrificing performance becomes extremely difficult. In this paper, we present an approach to stress test such systems based on the rapidly exploring random tree (RRT) algorithm. We propose to find faults in such systems through adversarial agent perturbations, where the behaviors of other agents in an otherwise fixed scenario are modified. This creates a large search space of possibilities, which we explore both randomly and with a focused strategy that runs RRT in a bounded projection of the observable states that we call the objective space. The approach is applied to generate tests for evaluating overtaking logic and path planning algorithms in autonomous racing, where the vehicles are driving at high speed in an adversarial environment. We evaluate several autonomous racing path planners, finding numerous collisions during overtake maneuvers in all planners. The focused RRT search finds several times more crashes than the random strategy, and, for certain planners, tens to hundreds of times more crashes in the second half of the track.
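
The tree-growing step of RRT in a bounded low-dimensional space can be sketched as follows. This is a generic minimal RRT, not the authors' test generator: in the paper each extension would correspond to simulating a perturbed scenario, which is omitted here. The 2D bounds, step size, and function name `rrt_explore` are assumptions made for illustration.

```python
import math
import random


def rrt_explore(start, bounds, iterations=200, step=0.5, seed=0):
    """Grow a tree from `start` by repeatedly stepping toward random samples."""
    rng = random.Random(seed)
    nodes = [tuple(start)]
    parents = {0: None}  # node index -> parent index
    for _ in range(iterations):
        # Sample a random target inside the bounded space.
        target = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        # Find the existing node closest to the target.
        nearest_idx = min(
            range(len(nodes)), key=lambda i: math.dist(nodes[i], target)
        )
        nearest = nodes[nearest_idx]
        distance = math.dist(nearest, target)
        if distance == 0.0:
            continue
        # Move at most `step` from the nearest node toward the target.
        scale = min(1.0, step / distance)
        new_node = tuple(n + scale * (t - n) for n, t in zip(nearest, target))
        parents[len(nodes)] = nearest_idx
        nodes.append(new_node)
    return nodes, parents


# Assumed 2D space, e.g. two observable quantities of the scenario.
SPACE = [(0.0, 10.0), (0.0, 10.0)]
tree_nodes, tree_parents = rrt_explore((5.0, 5.0), SPACE)
```

Because samples are drawn uniformly and the nearest node is extended, the tree is biased toward unexplored regions, which is the property that makes the search more focused than purely random perturbation.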

ROJul 20, 2021
Track based Offline Policy Learning for Overtaking Maneuvers with Autonomous Racecars

Jayanth Bhargav, Johannes Betz, Hongrui Zheng et al.

The rising popularity of driverless cars has led to research and development in the field of autonomous racing, and overtaking in autonomous racing is a challenging task. Vehicles have to detect and operate at the limits of dynamic handling, and decisions in the car have to be made at high speeds and high acceleration. One of the most crucial parts of autonomous racing is path planning and decision making for an overtaking maneuver with a dynamic opponent vehicle. In this paper, we present the evaluation of a track-based offline policy learning approach for autonomous racing. We define specific track portions and conduct offline experiments to evaluate the probability of an overtaking maneuver based on the speed and position of the ego vehicle. Based on these experiments, we can define overtaking probability distributions for each of the track portions. Further, we propose a switching MPCC controller setup for incorporating the learnt policies to achieve a higher rate of overtaking maneuvers. Through exhaustive simulations, we show that our proposed algorithm is able to increase the number of overtakes at different track portions.
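
The idea of estimating an overtaking probability per track portion from offline trials, then using it to switch driving modes, can be sketched as below. This is a simplified illustration and not the authors' method: the paper conditions on ego speed and position and uses a switching MPCC controller, while this sketch only counts outcomes per portion. The portion names, trial data, and the 0.5 threshold are hypothetical.

```python
from collections import defaultdict


def overtake_probabilities(trials):
    """Empirical success rate per track portion from (portion, success) records."""
    counts = defaultdict(lambda: [0, 0])  # portion -> [successes, attempts]
    for portion, success in trials:
        counts[portion][1] += 1
        if success:
            counts[portion][0] += 1
    return {p: s / n for p, (s, n) in counts.items()}


def choose_mode(portion, probabilities, threshold=0.5):
    """Pick a driving mode; unseen portions default to following."""
    if probabilities.get(portion, 0.0) >= threshold:
        return "overtake"
    return "follow"


# Hypothetical offline trial log.
TRIALS = [
    ("straight", True), ("straight", True), ("straight", False),
    ("chicane", False), ("chicane", False),
    ("hairpin", True), ("hairpin", False),
]
probabilities = overtake_probabilities(TRIALS)
mode_on_straight = choose_mode("straight", probabilities)
mode_in_chicane = choose_mode("chicane", probabilities)
```

The lookup is computed once offline, so the online decision reduces to a dictionary read.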

SYJan 25, 2021
Learning-'N-Flying: A Learning-based, Decentralized Mission Aware UAS Collision Avoidance Scheme

Alëna Rodionova, Yash Vardhan Pant, Connor Kurtz et al.

Urban Air Mobility, the scenario where hundreds of manned and Unmanned Aircraft Systems (UAS) carry out a wide variety of missions (e.g., moving humans and goods within the city), is gaining acceptance as a transportation solution of the future. One of the key requirements for this to happen is safely managing the air traffic in these urban airspaces. Due to the expected density of the airspace, this requires fast autonomous solutions that can be deployed online. We propose Learning-'N-Flying (LNF), a multi-UAS Collision Avoidance (CA) framework. It is decentralized, works on-the-fly, and allows autonomous UAS managed by different operators to safely carry out complex missions, represented using Signal Temporal Logic, in a shared airspace. We initially formulate the problem of predictive collision avoidance for two UAS as a mixed-integer linear program, and show that it is intractable to solve online. Instead, we first develop Learning-to-Fly (L2F) by combining: a) learning-based decision-making, and b) decentralized convex optimization-based control. LNF extends L2F to cases where there are more than two UAS on a collision path. Through extensive simulations, we show that our method can run online (computation time on the order of milliseconds), and under certain assumptions has failure rates of less than 1% in the worst case, improving to near 0% in more relaxed operations. We show the applicability of our scheme to a wide variety of settings through multiple case studies.

SYJun 23, 2020
Learning-to-Fly: Learning-based Collision Avoidance for Scalable Urban Air Mobility

Alëna Rodionova, Yash Vardhan Pant, Kuk Jang et al.

With increasing urban population, there is global interest in Urban Air Mobility (UAM), where hundreds of autonomous Unmanned Aircraft Systems (UAS) execute missions in the airspace above cities. Unlike traditional human-in-the-loop air traffic management, UAM requires decentralized autonomous approaches that scale to aircraft densities an order of magnitude higher and are applicable to urban settings. We present Learning-to-Fly (L2F), a decentralized on-demand airborne collision avoidance framework for multiple UAS that allows them to independently plan and safely execute missions with spatial, temporal, and reactive objectives expressed using Signal Temporal Logic. We formulate the problem of predictively avoiding collisions between two UAS without violating mission objectives as a Mixed Integer Linear Program (MILP). This, however, is intractable to solve online. Instead, we develop L2F, a two-stage collision avoidance method that consists of: 1) a learning-based decision-making scheme and 2) a distributed, linear programming-based UAS control algorithm. Through extensive simulations, we show the real-time applicability of our method, which is $\approx\!6000\times$ faster than the MILP approach, can resolve $100\%$ of collisions when there is ample room to maneuver, and degrades gracefully in performance otherwise. We also compare L2F to two other methods and demonstrate an implementation on quad-rotor robots.

LGMar 9, 2020
FormulaZero: Distributionally Robust Online Adaptation via Offline Population Synthesis

Aman Sinha, Matthew O'Kelly, Hongrui Zheng et al.

Balancing performance and safety is crucial to deploying autonomous vehicles in multi-agent environments. In particular, autonomous racing is a domain that penalizes safe but conservative policies, highlighting the need for robust, adaptive strategies. Current approaches either make simplifying assumptions about other agents or lack robust mechanisms for online adaptation. This work makes algorithmic contributions to both challenges. First, to generate a realistic, diverse set of opponents, we develop a novel method for self-play based on replica-exchange Markov chain Monte Carlo. Second, we propose a distributionally robust bandit optimization procedure that adaptively adjusts risk aversion relative to uncertainty in beliefs about opponents' behaviors. We rigorously quantify the tradeoffs in performance and robustness when approximating these computations in real-time motion-planning, and we demonstrate our methods experimentally on autonomous vehicles that achieve scaled speeds comparable to Formula One racecars.

SYOct 9, 2018
Synthesizing Stealthy Reprogramming Attacks on Cardiac Devices

Nicola Paoletti, Zhihao Jiang, Md Ariful Islam et al.

An Implantable Cardioverter Defibrillator (ICD) is a medical device used for the detection of potentially fatal cardiac arrhythmias and their treatment through the delivery of electrical shocks intended to restore normal heart rhythm. An ICD reprogramming attack seeks to alter the device's parameters to induce unnecessary shocks and, even more egregiously, prevent required therapy. In this paper, we present a formal approach for the synthesis of ICD reprogramming attacks that are both effective, i.e., lead to fundamental changes in the required therapy, and stealthy, i.e., involve minimal changes to the nominal ICD parameters. We focus on the discrimination algorithm underlying Boston Scientific devices (one of the principal ICD manufacturers) and formulate the synthesis problem as one of multi-objective optimization. Our solution technique is based on an Optimization Modulo Theories encoding of the problem and allows us to derive device parameters that are optimal with respect to the effectiveness-stealthiness tradeoff (i.e., lie along the corresponding Pareto front). To the best of our knowledge, our work is the first to derive systematic ICD reprogramming attacks designed to maximize therapy disruption while minimizing detection. To evaluate our technique, we employ an extensive dataset of synthetic EGMs (cardiac signals), each generated with a prescribed arrhythmia, allowing us to synthesize attacks tailored to the victim's cardiac condition. Our approach readily generalizes to unseen signals, representing the unknown EGM of the victim patient.
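
The notion of a Pareto front for a two-objective tradeoff, where one score is maximized and the other minimized, can be illustrated with a brute-force filter. This is a generic sketch and not the Optimization Modulo Theories encoding used in the paper. The candidate tuples are abstract, hypothetical `(score_to_maximize, cost_to_minimize)` pairs with no connection to real device parameters.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` on both objectives and better on one.

    Each point is (score_to_maximize, cost_to_minimize).
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical candidates: (score, cost).
CANDIDATES = [(0.2, 1), (0.5, 2), (0.4, 3), (0.9, 5), (0.9, 6), (0.1, 1)]
front = pareto_front(CANDIDATES)
```

Every point on the front represents a different balance between the two objectives; none can be improved on one axis without giving something up on the other.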