LGApr 4, 2022
Test Against High-Dimensional Uncertainties: Accelerated Evaluation of Autonomous Vehicles with Deep Importance SamplingMansur Arief, Zhepeng Cen, Zhenyuan Liu et al. · cmu
Evaluating the performance of autonomous vehicles (AV) and their complex subsystems to high precision under naturalistic circumstances remains a challenge, especially when failure or dangerous cases are rare. Rarity does not only require an enormous sample size for a naive method to achieve high confidence estimation, but it also causes dangerous underestimation of the true failure rate and it is extremely hard to detect. Meanwhile, the state-of-the-art approach that comes with a correctness guarantee can only compute an upper bound for the failure rate under certain conditions, which could limit its practical uses. In this work, we present Deep Importance Sampling (Deep IS) framework that utilizes a deep neural network to obtain an efficient IS that is on par with the state-of-the-art, capable of reducing the required sample size 43 times smaller than the naive sampling method to achieve 10% relative error and while producing an estimate that is much less conservative. Our high-dimensional experiment estimating the misclassification rate of one of the state-of-the-art traffic sign classifiers further reveals that this efficiency still holds true even when the target is very small, achieving over 600 times efficiency boost. This highlights the potential of Deep IS in providing a precise estimate even against high-dimensional uncertainties.
CVSep 16, 2023
Enhancing Visual Perception in Novel Environments via Incremental Data Augmentation Based on Style TransferAbhibha Gupta, Rully Agus Hendrawan, Mansur Arief · cmu
The deployment of autonomous agents in real-world scenarios is challenged by "unknown unknowns", i.e. novel unexpected environments not encountered during training, such as degraded signs. While existing research focuses on anomaly detection and class imbalance, it often fails to address truly novel scenarios. Our approach enhances visual perception by leveraging the Variational Prototyping Encoder (VPE) to adeptly identify and handle novel inputs, then incrementally augmenting data using neural style transfer to enrich underrepresented data. By comparing models trained solely on original datasets with those trained on a combination of original and augmented datasets, we observed a notable improvement in the performance of the latter. This underscores the critical role of data augmentation in enhancing model robustness. Our findings suggest the potential benefits of incorporating generative models for domain-specific augmentation strategies.
SYDec 1, 2025
AI-Driven Optimization under Uncertainty for Mineral Processing OperationsWilliam Xu, Amir Eskanlou, Mansur Arief et al.
The global capacity for mineral processing must expand rapidly to meet the demand for critical minerals, which are essential for building the clean energy technologies necessary to mitigate climate change. However, the efficiency of mineral processing is severely limited by uncertainty, which arises from both the variability of feedstock and the complexity of process dynamics. To optimize mineral processing circuits under uncertainty, we introduce an AI-driven approach that formulates mineral processing as a Partially Observable Markov Decision Process (POMDP). We demonstrate the capabilities of this approach in handling both feedstock uncertainty and process model uncertainty to optimize the operation of a simulated, simplified flotation cell as an example. We show that by integrating the process of information gathering (i.e., uncertainty reduction) and process optimization, this approach has the potential to consistently perform better than traditional approaches at maximizing an overall objective, such as net present value (NPV). Our methodological demonstration of this optimization-under-uncertainty approach for a synthetic case provides a mathematical and computational framework for later real-world application, with the potential to improve both the laboratory-scale design of experiments and industrial-scale operation of mineral processing circuits without any additional hardware.
ROJul 22, 2024
Importance Sampling-Guided Meta-Training for Intelligent Agents in Highly Interactive EnvironmentsMansur Arief, Mike Timmerman, Jiachen Li et al.
Training intelligent agents to navigate highly interactive environments presents significant challenges. While guided meta reinforcement learning (RL) approach that first trains a guiding policy to train the ego agent has proven effective in improving generalizability across scenarios with various levels of interaction, the state-of-the-art method tends to be overly sensitive to extreme cases, impairing the agents' performance in the more common scenarios. This study introduces a novel training framework that integrates guided meta RL with importance sampling (IS) to optimize training distributions iteratively for navigating highly interactive driving scenarios, such as T-intersections or roundabouts. Unlike traditional methods that may underrepresent critical interactions or overemphasize extreme cases during training, our approach strategically adjusts the training distribution towards more challenging driving behaviors using IS proposal distributions and applies the importance ratio to de-bias the result. By estimating a naturalistic distribution from real-world datasets and employing a mixture model for iterative training refinements, the framework ensures a balanced focus across common and extreme driving scenarios. Experiments conducted with both synthetic and naturalistic datasets demonstrate both accelerated training and performance improvements under highly interactive driving tasks.
ROApr 15
Optimized Human-Robot Co-Dispatch Planning for Petro-Site Surveillance under Varying CriticalitiesNur Ahmad Khatim, Mansur Arief
Securing petroleum infrastructure requires balancing autonomous system efficiency with human judgment for threat escalation, a challenge unaddressed by classical facility location models assuming homogeneous resources. This paper formulates the Human-Robot Co-Dispatch Facility Location Problem (HRCD-FLP), a capacitated facility location variant incorporating tiered infrastructure criticality, human-robot supervision ratio constraints, and minimum utilization requirements. We evaluate command center selection across three technology maturity scenarios. Results show transitioning from conservative (1:3 human-robot supervision) to future autonomous operations (1:10) yields significant cost reduction while maintaining complete critical infrastructure coverage. For small problems, exact methods dominate in both cost and computation time; for larger problems, the proposed heuristic achieves feasible solutions in under 3 minutes with approximately 14% optimality gap where comparison is possible. From systems perspective, our work demonstrate that optimized planning for human-robot teaming is key to achieve both cost-effective and mission-reliable deployments.
ROJun 20, 2025
General-Purpose Robotic Navigation via LVLM-Orchestrated Perception, Reasoning, and ActingBernard Lange, Anil Yildiz, Mansur Arief et al.
Developing general-purpose navigation policies for unknown environments remains a core challenge in robotics. Most existing systems rely on task-specific neural networks and fixed information flows, limiting their generalizability. Large Vision-Language Models (LVLMs) offer a promising alternative by embedding human-like knowledge for reasoning and planning, but prior LVLM-robot integrations have largely depended on pre-mapped spaces, hard-coded representations, and rigid control logic. We introduce the Agentic Robotic Navigation Architecture (ARNA), a general-purpose framework that equips an LVLM-based agent with a library of perception, reasoning, and navigation tools drawn from modern robotic stacks. At runtime, the agent autonomously defines and executes task-specific workflows that iteratively query modules, reason over multimodal inputs, and select navigation actions. This agentic formulation enables robust navigation and reasoning in previously unmapped environments, offering a new perspective on robotic stack design. Evaluated in Habitat Lab on the HM-EQA benchmark, ARNA outperforms state-of-the-art EQA-specific approaches. Qualitative results on RxR and custom tasks further demonstrate its ability to generalize across a broad range of navigation challenges.
SYNov 12, 2024
Optimal Control of Mechanical Ventilators with Learned Respiratory DynamicsIsaac Ronald Ward, Dylan M. Asmar, Mansur Arief et al.
Deciding on appropriate mechanical ventilator management strategies significantly impacts the health outcomes for patients with respiratory diseases. Acute Respiratory Distress Syndrome (ARDS) is one such disease that requires careful ventilator operation to be effectively treated. In this work, we frame the management of ventilators for patients with ARDS as a sequential decision making problem using the Markov decision process framework. We implement and compare controllers based on clinical guidelines contained in the ARDSnet protocol, optimal control theory, and learned latent dynamics represented as neural networks. The Pulse Physiology Engine's respiratory dynamics simulator is used to establish a repeatable benchmark, gather simulated data, and quantitatively compare these controllers. We score performance in terms of measured improvement in established ARDS health markers (pertaining to improved respiratory rate, oxygenation, and vital signs). Our results demonstrate that techniques leveraging neural networks and optimal control can automatically discover effective ventilation management strategies without access to explicit ventilator management procedures or guidelines (such as those defined in the ARDSnet protocol).
SYAug 15, 2025
Optimized Renewable Energy Planning MDP for Socially-Equitable Electricity Coverage in the USRiya Kinnarkar, Mansur Arief
Traditional power grid infrastructure presents significant barriers to renewable energy integration and perpetuates energy access inequities, with low-income communities experiencing disproportionately longer power outages. This study develops a Markov Decision Process (MDP) framework to optimize renewable energy allocation while explicitly addressing social equity concerns in electricity distribution. The model incorporates budget constraints, energy demand variability, and social vulnerability indicators across eight major U.S. cities to evaluate policy alternatives for equitable clean energy transitions. Numerical experiments compare the MDP-based approach against baseline policies including random allocation, greedy renewable expansion, and expert heuristics. Results demonstrate that equity-focused optimization can achieve 32.9% renewable energy penetration while reducing underserved low-income populations by 55% compared to conventional approaches. The expert policy achieved the highest reward, while the Monte Carlo Tree Search baseline provided competitive performance with significantly lower budget utilization, demonstrating that fair distribution of clean energy resources is achievable without sacrificing overall system performance and providing ways for integrating social equity considerations with climate goals and inclusive access to clean power infrastructure.
AIFeb 8, 2025
Managing Geological Uncertainty in Critical Mineral Supply Chains: A POMDP Approach with Application to U.S. Lithium ResourcesMansur Arief, Yasmine Alonso, CJ Oshiro et al.
The world is entering an unprecedented period of critical mineral demand, driven by the global transition to renewable energy technologies and electric vehicles. This transition presents unique challenges in mineral resource development, particularly due to geological uncertainty-a key characteristic that traditional supply chain optimization approaches do not adequately address. To tackle this challenge, we propose a novel application of Partially Observable Markov Decision Processes (POMDPs) that optimizes critical mineral sourcing decisions while explicitly accounting for the dynamic nature of geological uncertainty. Through a case study of the U.S. lithium supply chain, we demonstrate that POMDP-based policies achieve superior outcomes compared to traditional approaches, especially when initial reserve estimates are imperfect. Our framework provides quantitative insights for balancing domestic resource development with international supply diversification, offering policymakers a systematic approach to strategic decision-making in critical mineral supply chains.
ROJun 20, 2024
Diffusion-Based Failure Sampling for Evaluating Safety-Critical Autonomous SystemsHarrison Delecki, Marc R. Schlichting, Mansur Arief et al.
Validating safety-critical autonomous systems in high-dimensional domains such as robotics presents a significant challenge. Existing black-box approaches based on Markov chain Monte Carlo may require an enormous number of samples, while methods based on importance sampling often rely on simple parametric families that may struggle to represent the distribution over failures. We propose to sample the distribution over failures using a conditional denoising diffusion model, which has shown success in complex high-dimensional problems such as robotic task planning. We iteratively train a diffusion model to produce state trajectories closer to failure. We demonstrate the effectiveness of our approach on high-dimensional robotic validation tasks, improving sample efficiency and mode coverage compared to existing black-box techniques.
ROFeb 4, 2022
A Survey on Safety-Critical Driving Scenario Generation -- A Methodological PerspectiveWenhao Ding, Chejian Xu, Mansur Arief et al.
Autonomous driving systems have witnessed a significant development during the past years thanks to the advance in machine learning-enabled sensing and decision-making algorithms. One critical challenge for their massive deployment in the real world is their safety evaluation. Most existing driving systems are still trained and evaluated on naturalistic scenarios collected from daily life or heuristically-generated adversarial ones. However, the large population of cars, in general, leads to an extremely low collision rate, indicating that the safety-critical scenarios are rare in the collected real-world data. Thus, methods to artificially generate scenarios become crucial to measure the risk and reduce the cost. In this survey, we focus on the algorithms of safety-critical scenario generation in autonomous driving. We first provide a comprehensive taxonomy of existing algorithms by dividing them into three categories: data-driven generation, adversarial generation, and knowledge-based generation. Then, we discuss useful tools for scenario generation, including simulation platforms and packages. Finally, we extend our discussion to five main challenges of current works -- fidelity, efficiency, diversity, transferability, controllability -- and research opportunities lighted up by these challenges.
MENov 3, 2021
Certifiable Deep Importance Sampling for Rare-Event Simulation of Black-Box SystemsMansur Arief, Yuanlu Bai, Wenhao Ding et al.
Rare-event simulation techniques, such as importance sampling (IS), constitute powerful tools to speed up challenging estimation of rare catastrophic events. These techniques often leverage the knowledge and analysis on underlying system structures to endow desirable efficiency guarantees. However, black-box problems, especially those arising from recent safety-critical applications of AI-driven physical systems, can fundamentally undermine their efficiency guarantees and lead to dangerous under-estimation without diagnostically detected. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS, by converting black-box samplers that are versatile but could lack guarantees, into one with what we call a relaxed efficiency certificate that allows accurate estimation of bounds on the rare-event probability. We present the theory of Deep-PrAE that combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples including the safety-testing of intelligent driving algorithms.
LGJun 28, 2020
Deep Probabilistic Accelerated Evaluation: A Robust Certifiable Rare-Event Simulation Methodology for Black-Box Safety-Critical SystemsMansur Arief, Zhiyuan Huang, Guru Koushik Senthil Kumar et al.
Evaluating the reliability of intelligent physical systems against rare safety-critical events poses a huge testing burden for real-world applications. Simulation provides a useful platform to evaluate the extremal risks of these systems before their deployments. Importance Sampling (IS), while proven to be powerful for rare-event simulation, faces challenges in handling these learning-based systems due to their black-box nature that fundamentally undermines its efficiency guarantee, which can lead to under-estimation without diagnostically detected. We propose a framework called Deep Probabilistic Accelerated Evaluation (Deep-PrAE) to design statistically guaranteed IS, by converting black-box samplers that are versatile but could lack guarantees, into one with what we call a relaxed efficiency certificate that allows accurate estimation of bounds on the safety-critical event probability. We present the theory of Deep-PrAE that combines the dominating point concept with rare-event set learning via deep neural network classifiers, and demonstrate its effectiveness in numerical examples including the safety-testing of an intelligent driving algorithm.
ROSep 19, 2019
How to Evaluate Proving Grounds for Self-Driving? A Quantitative ApproachRui Chen, Mansur Arief, Weiyang Zhang et al.
Proving ground has been a critical component in testing and validation for Connected and Automated Vehicles (CAV). Although quite a few world-class testing facilities have been under construction over the years, the evaluation of proving grounds themselves as testing approaches has rarely been studied. In this paper, we present the first attempt to systematically evaluate CAV proving grounds and contribute to a generative sample-based approach to assessing the representation of traffic scenarios in proving grounds. Leveraging typical use cases extracted from naturalistic driving events, we establish a strong link between proving ground testing results of CAVs and their anticipated public street performance. We present benchmark results of our approach on three world-class CAV testing facilities: Mcity, Almono (Uber ATG), and Kcity. We successfully show the overall evaluation of these proving grounds in terms of their capability to accommodate real-world traffic scenarios. We believe that when the effectiveness of a testing ground itself is validated, the testing results would grant more confidence for CAV public deployment.
ROJun 25, 2019
Modeling Multi-Vehicle Interaction Scenarios Using Gaussian Random FieldYaohui Guo, Vinay Varma Kalidindi, Mansur Arief et al.
Autonomous vehicles are expected to navigate in complex traffic scenarios with multiple surrounding vehicles. The correlations between road users vary over time, the degree of which, in theory, could be infinitely large, thus posing a great challenge in modeling and predicting the driving environment. In this paper, we propose a method to model multi-vehicle interactions using a stochastic vector field model and apply non-parametric Bayesian learning to extract the underlying motion patterns from a large quantity of naturalistic traffic data. We then use this model to reproduce the high-dimensional driving scenarios in a finitely tractable form. We use a Gaussian process to model multi-vehicle motion, and a Dirichlet process to assign each observation to a specific scenario. We verify the effectiveness of the proposed method on highway and intersection datasets from the NGSIM project, in which complex multi-vehicle interactions are prevalent. The results show that the proposed method can capture motion patterns from both settings, without imposing heroic prior, and hence demonstrate the potential application for a wide array of traffic situations. The proposed modeling method could enable simulation platforms and other testing methods designed for autonomous vehicle evaluation, to easily model and generate traffic scenarios emulating large scale driving data.
MEApr 19, 2019
Evaluation Uncertainty in Data-Driven Self-Driving TestingZhiyuan Huang, Mansur Arief, Henry Lam et al.
Safety evaluation of self-driving technologies has been extensively studied. One recent approach uses Monte Carlo based evaluation to estimate the occurrence probabilities of safety-critical events as safety measures. These Monte Carlo samples are generated from stochastic input models constructed based on real-world data. In this paper, we propose an approach to assess the impact on the probability estimates from the evaluation procedures due to the estimation error caused by data variability. Our proposed method merges the classical bootstrap method for estimating input uncertainty with a likelihood ratio based scheme to reuse experiment outputs. This approach is economical and efficient in terms of implementation costs in assessing input uncertainty for the evaluation of self-driving technology. We use an example in autonomous vehicle (AV) safety evaluation to demonstrate the proposed approach as a diagnostic tool for the quality of the fitted input model.
ROSep 16, 2018
Where Should We Place LiDARs on the Autonomous Vehicle? - An Optimal Design ApproachZuxin Liu, Mansur Arief, Ding Zhao
Autonomous vehicle manufacturers recognize that LiDAR provides accurate 3D views and precise distance measures under highly uncertain driving conditions. Its practical implementation, however, remains costly. This paper investigates the optimal LiDAR configuration problem to achieve utility maximization. We use the perception area and non-detectable subspace to construct the design procedure as solving a min-max optimization problem and propose a bio-inspired measure -- volume to surface area ratio (VSR) -- as an easy-to-evaluate cost function representing the notion of the size of the non-detectable subspaces of a given configuration. We then adopt a cuboid-based approach to show that the proposed VSR-based measure is a well-suited proxy for object detection rate. It is found that the Artificial Bee Colony evolutionary algorithm yields a tractable cost function computation. Our experiments highlight the effectiveness of our proposed VSR measure in terms of cost-effectiveness configuration as well as providing insightful analyses that can improve the design of AV systems.
ROAug 9, 2018
An "Xcity" Optimization Approach to Designing Proving Grounds for Connected and Autonomous VehiclesRui Chen, Mansur Arief, Ding Zhao
Proving ground, or on-track testing has been an essential part of testing and validation process for connected and autonomous vehicles (CAV). Several world-class CAV proving grounds, such as Mcity at the University of Michigan and The Castle of Waymo, have already been built, and many more are currently under construction. In this paper, we propose the first optimization approach to CAV proving ground designing and refer to any such CAV-centric design problem as "Xcity" to emphasize the enormous investment, the multi-dimensional spatial consideration, and the immense construction effort emerging globally. Inspired by the recent progress on traffic encounter clustering, we further define "road assets" as fundamental building blocks and formulate the whole design process into nonlinear optimization problems. We have shown that such framework can be utilized to adaptively generate CAV proving ground designs with optimized capability and flexibility and can further be extended to evaluate an existing "Xcity" design.
ROMay 5, 2018
An Accelerated Approach to Safely and Efficiently Test Pre-Production Autonomous Vehicles on Public StreetsMansur Arief, Peter Glynn, Ding Zhao
Various automobile and mobility companies, for instance Ford, Uber and Waymo, are currently testing their pre-produced autonomous vehicle (AV) fleets on the public roads. However, due to rareness of the safety-critical cases and, effectively, unlimited number of possible traffic scenarios, these on-road testing efforts have been acknowledged as tedious, costly, and risky. In this study, we propose Accelerated De- ployment framework to safely and efficiently estimate the AVs performance on public streets. We showed that by appropriately addressing the gradual accuracy improvement and adaptively selecting meaningful and safe environment under which the AV is deployed, the proposed framework yield to highly accurate estimation with much faster evaluation time, and more importantly, lower deployment risk. Our findings provide an answer to the currently heated and active discussions on how to properly test AV performance on public roads so as to achieve safe, efficient, and statistically-reliable testing framework for AV technologies.