Ransalu Senanayake

RO
Semantic Scholar Profile
h-index 39
37 papers
577 citations
Novelty 47%
AI Score 55

37 Papers

LG · May 30
CUPID in the Model Zoo: Online Matchmaking for Selecting Your Dream LLM

Son Nguyen, Xinyuan Liu, Ransalu Senanayake

Users increasingly face the challenge of selecting an appropriate LLM for a given task from a rapidly growing pool of LLMs, each with distinct but often opaque latent properties. Compounding this challenge, users may lack the vocabulary or awareness to explicitly articulate the characteristics they value in an LLM's responses or deployment. We propose an interaction-efficient active learning framework in which a dueling bandit algorithm iteratively selects pairs of LLMs, collects user feedback about their responses, and updates its belief about the user's latent preferences. We introduce a novel belief-aware upper confidence bound strategy that balances exploration of the model pool with exploitation of inferred preferences, enabling efficient alignment between user needs and LLM capabilities under user-specified cost and time budgets. Through diverse experiments on LLMs and human studies, we experimentally verify that our model can efficiently match well-aligned LLMs to users at a lower cost.
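
The pair-selection loop described in this abstract can be illustrated with a minimal sketch. It is a generic UCB-style dueling bandit, not the belief-aware strategy the paper proposes; the function `run_duels`, the `prefers` callback, the utilities, and the simulated Bradley-Terry user are all assumptions made for illustration.

```python
import math
import random


def run_duels(num_models, prefers, rounds=300):
    """Generic UCB-style dueling bandit over a pool of models.

    prefers(i, j) returns True when the user prefers model i's response.
    """
    wins = [[0] * num_models for _ in range(num_models)]

    def ucb(i, j, t):
        # Optimistic estimate of P(user prefers i over j).
        n = wins[i][j] + wins[j][i]
        if n == 0:
            return 1.0
        return wins[i][j] / n + math.sqrt(2.0 * math.log(t) / n)

    for t in range(1, rounds + 1):
        # Count the rivals each model could still plausibly beat.
        scores = [
            sum(ucb(i, j, t) >= 0.5 for j in range(num_models) if j != i)
            for i in range(num_models)
        ]
        first = max(range(num_models), key=lambda i: scores[i])
        # Challenger: the rival with the highest optimistic chance of beating `first`.
        second = max(
            (j for j in range(num_models) if j != first),
            key=lambda j: ucb(j, first, t),
        )
        if prefers(first, second):
            wins[first][second] += 1
        else:
            wins[second][first] += 1

    def rate(i, j):
        n = wins[i][j] + wins[j][i]
        return wins[i][j] / n if n else 0.5

    # Recommend the model with the best summed empirical win rate.
    return max(
        range(num_models),
        key=lambda i: sum(rate(i, j) for j in range(num_models) if j != i),
    )


# Simulated user (hypothetical): noisy Bradley-Terry feedback over made-up utilities.
rng = random.Random(0)
utility = [0.0, 0.5, 2.0]


def noisy_user(i, j):
    p_i = 1.0 / (1.0 + math.exp(utility[j] - utility[i]))
    return rng.random() < p_i


recommended = run_duels(len(utility), noisy_user)
```

In a real deployment the `prefers` callback would be replaced by actual user feedback on a pair of responses, and `rounds` would be bounded by the user's cost and time budget.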

LG · Jun 3, 2022
Disentangling Epistemic and Aleatoric Uncertainty in Reinforcement Learning

Bertrand Charpentier, Ransalu Senanayake, Mykel Kochenderfer et al.

Characterizing aleatoric and epistemic uncertainty on the predicted rewards can help in building reliable reinforcement learning (RL) systems. Aleatoric uncertainty results from the irreducible environment stochasticity leading to inherently risky states and actions. Epistemic uncertainty results from the limited information accumulated during learning to make informed decisions. Characterizing aleatoric and epistemic uncertainty can be used to speed up learning in a training environment, improve generalization to similar testing environments, and flag unfamiliar behavior in anomalous testing environments. In this work, we introduce a framework for disentangling aleatoric and epistemic uncertainty in RL. (1) We first define four desiderata that capture the desired behavior for aleatoric and epistemic uncertainty estimation in RL at both training and testing time. (2) We then present four RL models inspired by supervised learning (i.e. Monte Carlo dropout, ensemble, deep kernel learning models, and evidential networks) to instantiate aleatoric and epistemic uncertainty. Finally, (3) we propose a practical evaluation method to evaluate uncertainty estimation in model-free RL based on detection of out-of-distribution environments and generalization to perturbed environments. We present theoretical and experimental evidence to validate that carefully equipping model-free RL agents with supervised learning uncertainty methods can fulfill our desiderata.
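
One common way to separate the two kinds of uncertainty with an ensemble is the law of total variance. The sketch below assumes each ensemble member predicts a mean and a variance for the reward; the function name and the numbers are hypothetical and do not come from the paper.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(member_means, member_variances):
    """Split an ensemble's predictive variance via the law of total variance."""
    # Aleatoric part: the average noise each member predicts.
    aleatoric = fmean(member_variances)
    # Epistemic part: disagreement between the members' mean predictions.
    epistemic = pvariance(member_means)
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical reward predictions from a three-member ensemble.
means = [1.0, 2.0, 3.0]
variances = [0.5, 0.5, 0.5]
aleatoric, epistemic, total = decompose_uncertainty(means, variances)
```

Here the members agree on the noise level but disagree on the mean, so the epistemic term dominates; more data would be expected to shrink it, while the aleatoric term would remain.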

RO · Mar 26, 2022
How Do We Fail? Stress Testing Perception in Autonomous Vehicles

Harrison Delecki, Masha Itkina, Bernard Lange et al.

Autonomous vehicles (AVs) rely on environment perception and behavior prediction to reason about agents in their surroundings. These perception systems must be robust to adverse weather such as rain, fog, and snow. However, validation of these systems is challenging due to their complexity and dependence on observation histories. This paper presents a method for characterizing failures of LiDAR-based perception systems for AVs in adverse weather conditions. We develop a methodology based on reinforcement learning to find likely failures in object tracking and trajectory prediction due to sequences of disturbances. We apply disturbances using a physics-based data augmentation technique for simulating LiDAR point clouds in adverse weather conditions. Experiments performed across a wide range of driving scenarios from a real-world driving dataset show that our proposed approach finds high-likelihood failures with smaller input disturbances compared to baselines while remaining computationally tractable. Identified failures can inform future development of robust perception systems for AVs.

RO · Feb 14, 2023
Graph Attention Multi-Agent Fleet Autonomy for Advanced Air Mobility

Malintha Fernando, Ransalu Senanayake, Heeyoul Choi et al.

Autonomous mobility is emerging as a new disruptive mode of urban transportation for moving cargo and passengers. However, designing scalable autonomous fleet coordination schemes to accommodate fast-growing mobility systems is challenging primarily due to the increasing heterogeneity of the fleets, time-varying demand patterns, service area expansions, and communication limitations. We introduce the concept of partially observable advanced air mobility games to coordinate a fleet of aerial vehicles by accounting for the heterogeneity of the interacting agents and the self-interested nature inherent to commercial mobility fleets. To model the complex interactions among the agents and the observation uncertainty in the mobility networks, we propose a novel heterogeneous graph attention encoder-decoder (HetGAT Enc-Dec) neural network-based stochastic policy. We train the policy by leveraging deep multi-agent reinforcement learning, allowing decentralized decision-making for the agents using their local observations. Through extensive experimentation, we show that the learned policy generalizes to various fleet compositions, demand patterns, and observation topologies. Further, fleets operating under the HetGAT Enc-Dec policy outperform other state-of-the-art graph neural network policies by achieving the highest fleet reward and fulfillment ratios in on-demand mobility networks.

CV · Aug 24, 2024
Explainable Concept Generation through Vision-Language Preference Learning for Understanding Neural Networks' Internal Representations

Aditya Taparia, Som Sagar, Ransalu Senanayake

Understanding the inner representation of a neural network helps users improve models. Concept-based methods have become a popular choice for explaining deep neural networks post-hoc because, unlike most other explainable AI techniques, they can be used to test high-level visual "concepts" that are not directly related to feature attributes. For instance, the concept of "stripes" is important to classify an image as a zebra. Concept-based explanation methods, however, require practitioners to guess and manually collect multiple candidate concept image sets, making the process labor-intensive and prone to overlooking important concepts. Addressing this limitation, in this paper, we frame concept image set creation as an image generation problem. However, since naively using a standard generative model does not result in meaningful concepts, we devise a reinforcement learning-based preference optimization (RLPO) algorithm that fine-tunes a vision-language generative model from approximate textual descriptions of concepts. Through a series of experiments, we demonstrate our method's ability to efficiently and reliably articulate diverse concepts that are otherwise challenging to craft manually.

LG · Jul 3, 2022
Renaissance Robot: Optimal Transport Policy Fusion for Learning Diverse Skills

Julia Tan, Ransalu Senanayake, Fabio Ramos

Deep reinforcement learning (RL) is a promising approach to solving complex robotics problems. However, the process of learning through trial-and-error interactions is often highly time-consuming, despite recent advancements in RL algorithms. Additionally, the success of RL is critically dependent on how well the reward-shaping function suits the task, which is also time-consuming to design. As agents trained on a variety of robotics problems continue to proliferate, the ability to reuse their valuable learning for new domains becomes increasingly significant. In this paper, we propose a post-hoc technique for policy fusion using Optimal Transport theory as a robust means of consolidating the knowledge of multiple agents that have been trained on distinct scenarios. We further demonstrate that this provides an improved weights initialisation of the neural network policy for learning new tasks, requiring less time and computational resources than either retraining the parent policies or training a new policy from scratch. Ultimately, our results on diverse agents commonly used in deep RL show that specialised knowledge can be unified into a "Renaissance agent", allowing for quicker learning of new skills.

RO · Apr 15
Humanoid Factors: Design Principles for AI Humanoids in Human Worlds

Xinyuan Liu, Eren Sadikoglu, Ransalu Senanayake et al.

Human factors research has long focused on optimizing environments, tools, and systems to account for human performance. Yet, as humanoid robots begin to share our workplaces, homes, and public spaces, the design challenge expands. We must now consider not only factors for humans but also factors for humanoids, since both will coexist and interact within the same environments. Unlike conventional machines, humanoids introduce expectations of human-like behavior, communication, and social presence, which reshape usability, trust, and safety considerations. In this article, we introduce the concept of humanoid factors as a framework structured around four pillars - physical, cognitive, social, and ethical - that shape the development of humanoids to help them effectively coexist and collaborate with humans. This framework characterizes the overlap and divergence between human capabilities and those of general-purpose humanoids powered by AI foundation models. To demonstrate our framework's practical utility, we then apply the framework to evaluate a real-world humanoid control algorithm, illustrating how conventional task completion metrics in robotics overlook key human cognitive and interaction principles. We thus position humanoid factors as a foundational framework for designing, evaluating, and governing sustained human-humanoid coexistence.

RO · Sep 16, 2024
BaTCAVe: Trustworthy Explanations for Robot Behaviors

Som Sagar, Aditya Taparia, Harsh Mankodiya et al.

Black box neural networks are an indispensable part of modern robots. Nevertheless, deploying such high-stakes systems in real-world scenarios poses significant challenges when the stakeholders, such as engineers and legislative bodies, lack insights into the neural networks' decision-making process. Presently, explainable AI is primarily tailored to natural language processing and computer vision, falling short in two critical aspects when applied in robots: grounding in decision-making tasks and the ability to assess the trustworthiness of their explanations. In this paper, we introduce a trustworthy explainable robotics technique based on human-interpretable, high-level concepts that attribute to the decisions made by the neural network. Our proposed technique provides explanations with associated uncertainty scores by matching the neural network's activations with human-interpretable visualizations. To validate our approach, we conducted a series of experiments with various simulated and real-world robot decision-making models, demonstrating the effectiveness of the proposed approach as a post-hoc, human-friendly robot diagnostic tool.

LG · Jul 14, 2024
Towards Adapting Reinforcement Learning Agents to New Tasks: Insights from Q-Values

Ashwin Ramaswamy, Ransalu Senanayake

While contemporary reinforcement learning research and applications have embraced policy gradient methods as the panacea for solving learning problems, value-based methods can still be useful in many domains as long as we can wrangle with how to exploit them in a sample-efficient way. In this paper, we explore the chaotic nature of DQNs in reinforcement learning, while understanding how the information that they retain when trained can be repurposed for adapting a model to different tasks. We start by designing a simple experiment in which we are able to observe the Q-values for each state and action in an environment. Then we train in eight different ways to explore how these training algorithms affect the way that accurate Q-values are learned (or not learned). We tested the adaptability of each trained model when retrained to accomplish a slightly modified task. We then scaled our setup to test the larger problem of an autonomous vehicle at an unprotected intersection. We observed that the model is able to adapt to new tasks quicker when the base model's Q-value estimates are closer to the true Q-values. The results provide some insights and guidelines into what algorithms are useful for sample-efficient task adaptation.

AI · Mar 26
The Anatomy of Uncertainty in LLMs

Aditya Taparia, Ransalu Senanayake, Kowshik Thopalli et al.

Understanding why a large language model (LLM) is uncertain about its response is important for its reliable deployment. Current approaches, which either provide a single uncertainty score or rely on the classical aleatoric-epistemic dichotomy, fail to offer actionable insights for improving the generative model. Recent studies have also shown that such methods are not enough for understanding uncertainty in LLMs. In this work, we advocate for an uncertainty decomposition framework that dissects LLM uncertainty into three distinct semantic components: (i) input ambiguity, arising from ambiguous prompts; (ii) knowledge gaps, caused by insufficient parametric evidence; and (iii) decoding randomness, stemming from stochastic sampling. Through a series of experiments we demonstrate that the dominance of these components can shift across model size and task. Our framework provides a better understanding to audit LLM reliability and detect hallucinations, paving the way for targeted interventions and more trustworthy systems.

AI · Mar 13
LLM Routing as Reasoning: A MaxSAT View

Son Nguyen, Xinyuan Liu, Ransalu Senanayake

Routing a query through an appropriate LLM is challenging, particularly when user preferences are expressed in natural language and model attributes are only partially observable. We propose a constraint-based interpretation of language-conditioned LLM routing, formulating it as a weighted MaxSAT/MaxSMT problem in which natural language feedback induces hard and soft constraints over model attributes. Under this view, routing corresponds to selecting models that approximately maximize satisfaction of feedback-conditioned clauses. Empirical analysis on a 25-model benchmark shows that language feedback produces near-feasible recommendation sets, while no-feedback scenarios reveal systematic priors. Our results suggest that LLM routing can be understood as structured constraint optimization under language-conditioned preferences.
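
The constraint view in this abstract can be made concrete with a toy weighted-MaxSAT-style selection. The sketch below brute-forces over a tiny hypothetical model pool rather than calling a MaxSAT solver; the model names, attributes, clauses, and weights are all invented for illustration and do not reflect the paper's benchmark or encoding.

```python
# Hypothetical model pool with boolean attributes.
MODELS = {
    "model_a": {"open_weights": True, "low_cost": True, "long_context": False},
    "model_b": {"open_weights": False, "low_cost": False, "long_context": True},
    "model_c": {"open_weights": True, "low_cost": False, "long_context": True},
}


def satisfied(attrs, clause):
    """A clause is a list of (attribute, wanted_value) literals joined by OR."""
    return any(attrs[name] == wanted for name, wanted in clause)


def route(models, hard, soft):
    """Pick the model that meets every hard clause and maximizes soft weight.

    hard: list of clauses that must all hold.
    soft: list of (weight, clause) pairs that are rewarded when they hold.
    """
    best_name, best_score = None, -1.0
    for name, attrs in models.items():
        if not all(satisfied(attrs, clause) for clause in hard):
            continue  # violates a hard constraint, so it is infeasible
        score = sum(weight for weight, clause in soft if satisfied(attrs, clause))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Feedback such as "must be open weights; long context matters more than cost".
hard = [[("open_weights", True)]]
soft = [(2.0, [("long_context", True)]), (1.0, [("low_cost", True)])]
choice, score = route(MODELS, hard, soft)
```

With these made-up clauses, `model_b` is excluded by the hard constraint and `model_c` wins on soft weight. A realistic pool with partially observable attributes would call for a proper MaxSAT or MaxSMT solver instead of enumeration.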

AI · Feb 12
Learning to Configure Agentic AI Systems

Aditya Taparia, Som Sagar, Ransalu Senanayake

Configuring LLM-based agent systems involves choosing workflows, tools, token budgets, and prompts from a large combinatorial design space, and is typically handled today by fixed large templates or hand-tuned heuristics. This leads to brittle behavior and unnecessary compute, since the same cumbersome configuration is often applied to both easy and hard input queries. We formulate agent configuration as a query-wise decision problem and introduce ARC (Agentic Resource & Configuration learner), which learns a light-weight hierarchical policy using reinforcement learning to dynamically tailor these configurations. Across multiple benchmarks spanning reasoning and tool-augmented question answering, the learned policy consistently outperforms strong hand-designed and other baselines, achieving up to 25% higher task accuracy while also reducing token and runtime costs. These results demonstrate that learning per-query agent configurations is a powerful alternative to "one size fits all" designs.

AI · Apr 28
MEMOR-E: In-Context and Fine-Tuned LLM Personalization for Alzheimer's Assistive Robotics

Maissa Abir Smaili, Eren Sadikoglu, Ransalu Senanayake

Alzheimer's disease is a neurodegenerative disorder marked by progressive declines in memory and language that reduce independence in daily life, motivating socially assistive robotic support. This paper presents MEMOR-E, a mobile quadruped robot with an interactive tablet interface that assists patients and caregivers through medication reminders, routine guidance, memory-oriented interactions, and companionship. We evaluated the feasibility of fine-tuning large language models (LLMs) to emulate stage-consistent cognitive behavior and interpret responses across standard neuropsychological language tasks, using audio transcriptions from 235 Alzheimer's patients and synthetically generated healthy controls. We also report findings on using in-context learning (ICL) in LLMs, where a second LLM produced domain- and severity-level cognitive error summaries. Our results show that MEMOR-E can generate stage-aware, non-diagnostic cognitive summaries that support personalized assistive interactions, while explainable AI mechanisms translate model outputs into transparent, human-readable evidence to enable caregiver oversight and trustworthy human-robot interaction.

RO · Dec 3, 2024
From Mystery to Mastery: Failure Diagnosis for Improving Manipulation Policies

Som Sagar, Jiafei Duan, Sreevishakh Vasudevan et al. · uw

Robot manipulation policies often fail for unknown reasons, posing significant challenges for real-world deployment. Researchers and engineers typically address these failures using heuristic approaches, which are not only labor-intensive and costly but also prone to overlooking critical failure modes (FMs). This paper introduces Robot Manipulation Diagnosis (RoboMD), a systematic framework designed to automatically identify FMs arising from unanticipated changes in the environment. Considering the vast space of potential FMs in a pre-trained manipulation policy, we leverage deep reinforcement learning (deep RL) to explore and uncover these FMs using a specially trained vision-language embedding that encodes a notion of failures. This approach enables users to probabilistically quantify and rank failures in previously unseen environmental conditions. Through extensive experiments across various manipulation tasks and algorithms, we demonstrate RoboMD's effectiveness in diagnosing unknown failures in unstructured environments, providing a systematic pathway to improve the robustness of manipulation policies.

CL · Nov 12, 2024
ExpressivityArena: Can LLMs Express Information Implicitly?

Joshua Tint, Som Sagar, Aditya Taparia et al.

While Large Language Models (LLMs) have demonstrated remarkable performance in certain dimensions, their ability to express implicit language cues that humans use for effective communication remains unclear. This paper presents ExpressivityArena, a Python library for measuring the implicit communication abilities of LLMs. We provide a comprehensive framework to evaluate expressivity of arbitrary LLMs and explore its practical implications. To this end, we refine the definition and measurements of "expressivity," and use our framework in a set of small experiments. These experiments test LLMs in creative and logical tasks such as poetry, coding, and emotion-based responses. They are then evaluated by an automated grader, through ExpressivityArena, which we verify to be the most pragmatic for testing expressivity. Building on these experiments, we deepen our understanding of the expressivity of LLMs by assessing their ability to remain expressive in conversations. Our findings indicate that LLMs are capable of generating and understanding expressive content, albeit with some limitations. These insights will inform the future development and deployment of expressive LLMs. We provide the code for ExpressivityArena alongside our paper.

AI · May 25, 2025
Consistency-based Abductive Reasoning over Perceptual Errors of Multiple Pre-trained Models in Novel Environments

Mario Leiva, Noel Ngu, Joshua Shay Kricheli et al.

The deployment of pre-trained perception models in novel environments often leads to performance degradation due to distributional shifts. Although recent artificial intelligence approaches for metacognition use logical rules to characterize and filter model errors, improving precision often comes at the cost of reduced recall. This paper addresses the hypothesis that leveraging multiple pre-trained models can mitigate this recall reduction. We formulate the challenge of identifying and managing conflicting predictions from various models as a consistency-based abduction problem, building on the idea of abductive learning (ABL) but applying it to test-time instead of training. The input predictions and the learned error detection rules derived from each model are encoded in a logic program. We then seek an abductive explanation--a subset of model predictions--that maximizes prediction coverage while ensuring the rate of logical inconsistencies (derived from domain constraints) remains below a specified threshold. We propose two algorithms for this knowledge representation task: an exact method based on Integer Programming (IP) and an efficient Heuristic Search (HS). Through extensive experiments on a simulated aerial imagery dataset featuring controlled, complex distributional shifts, we demonstrate that our abduction-based framework outperforms individual models and standard ensemble baselines, achieving, for instance, average relative improvements of approximately 13.6% in F1-score and 16.6% in accuracy across 15 diverse test datasets when compared to the best individual model. Our results validate the use of consistency-based abduction as an effective mechanism to robustly integrate knowledge from multiple imperfect models in challenging, novel scenarios.

LG · Feb 18, 2025
Multiple Distribution Shift -- Aerial (MDS-A): A Dataset for Test-Time Error Detection and Model Adaptation

Noel Ngu, Aditya Taparia, Gerardo I. Simari et al.

Machine learning models assume that training and test samples are drawn from the same distribution. As such, significant differences between training and test distributions often lead to degradations in performance. We introduce Multiple Distribution Shift -- Aerial (MDS-A) -- a collection of inter-related datasets of the same aerial domain that are perturbed in different ways to better characterize the effects of out-of-distribution performance. Specifically, MDS-A is a set of simulated aerial datasets collected under different weather conditions. We include six datasets under different simulated weather conditions along with six baseline object-detection models, as well as several test datasets that are a mix of weather conditions that we show have significant differences from the training data. In this paper, we present characterizations of MDS-A, provide performance results for the baseline machine learning models (on both their specific training datasets and the test data), as well as results of the baselines after employing recent knowledge-engineering error-detection techniques (EDR) thought to improve out-of-distribution performance. The dataset is available at https://lab-v2.github.io/mdsa-dataset-website.

LG · Dec 9, 2024
MoSH: Modeling Multi-Objective Tradeoffs with Soft and Hard Bounds

Edward Chen, Natalie Dullerud, Thomas Niedermayr et al.

Countless science and engineering applications in multi-objective optimization (MOO) necessitate that decision-makers (DMs) select a Pareto-optimal solution which aligns with their preferences. Evaluating individual solutions is often expensive, necessitating cost-sensitive optimization techniques. Due to competing objectives, the space of trade-offs is also expansive -- thus, examining the full Pareto frontier may prove overwhelming to a DM. Such real-world settings generally have loosely-defined and context-specific desirable regions for each objective function that can aid in constraining the search over the Pareto frontier. We introduce a novel conceptual framework that operationalizes these priors using soft-hard functions, SHFs, which allow for the DM to intuitively impose soft and hard bounds on each objective -- which has been lacking in previous MOO frameworks. Leveraging a novel minimax formulation for Pareto frontier sampling, we propose a two-step process for obtaining a compact set of Pareto-optimal points which respect the user-defined soft and hard bounds: (1) densely sample the Pareto frontier using Bayesian optimization, and (2) sparsify the selected set to surface to the user, using robust submodular function optimization. We prove that (2) obtains the optimal compact Pareto-optimal set of points from (1). We further show that many practical problems fit within the SHF framework and provide extensive empirical validation on diverse domains, including brachytherapy, engineering design, and large language model personalization. Specifically, for brachytherapy, our approach returns a compact set of points with over 3% greater SHF-defined utility than the next best approach. Among the other diverse experiments, our approach consistently leads in utility, allowing the DM to reach >99% of their maximum possible desired utility within validation of 5 points.

RO · Jun 30, 2025
PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?

Atharva Gundawar, Som Sagar, Ransalu Senanayake

Vision-Language Models (VLMs) are increasingly pivotal for generalist robot manipulation, enabling tasks such as physical reasoning, policy generation, and failure detection. However, their proficiency in these high-level applications often assumes a deep understanding of low-level physical prerequisites, a capability that remains largely unverified. For robots to perform actions reliably, they must comprehend intrinsic object properties (e.g., material, weight), action affordances (e.g., graspable, stackable), and physical constraints (e.g., stability, reachability, or an object's state, such as being closed). Despite the widespread use of VLMs in manipulation tasks, we argue that off-the-shelf models may lack this granular, physically grounded understanding, as such prerequisites are often overlooked during training. To address this critical gap, we introduce PAC Bench, a comprehensive benchmark designed to systematically evaluate VLMs on their understanding of core Properties, Affordances, and Constraints (PAC) from a task executability perspective. PAC Bench features a diverse dataset with over 30,000 annotations, comprising 673 real-world images (115 object classes, 15 property types, and 1 to 3 affordances defined per class), 100 real-world humanoid-view scenarios, and 120 unique simulated constraint scenarios across four tasks. Our evaluations reveal significant gaps in the ability of current VLMs to grasp fundamental physical concepts, highlighting limitations in their suitability for reliable robot manipulation and pointing to key areas for targeted research. PAC Bench also serves as a standardized benchmark for rigorously evaluating physical reasoning in VLMs and guiding the development of more robust, physically grounded models for robotic applications. Project Page: https://pacbench.github.io/

CV · May 19, 2025
VLC Fusion: Vision-Language Conditioned Sensor Fusion for Robust Object Detection

Aditya Taparia, Noel Ngu, Mario Leiva et al.

Although fusing multiple sensor modalities can enhance object detection performance, existing fusion approaches often overlook subtle variations in environmental conditions and sensor inputs. As a result, they struggle to adaptively weight each modality under such variations. To address this challenge, we introduce Vision-Language Conditioned Fusion (VLC Fusion), a novel fusion framework that leverages a Vision-Language Model (VLM) to condition the fusion process on nuanced environmental cues. By capturing high-level environmental context such as darkness, rain, and camera blurring, the VLM guides the model to dynamically adjust modality weights based on the current scene. We evaluate VLC Fusion on real-world autonomous driving and military target detection datasets that include image, LIDAR, and mid-wave infrared modalities. Our experiments show that VLC Fusion consistently outperforms conventional fusion baselines, achieving improved detection accuracy in both seen and unseen scenarios.

LG · Oct 22, 2024
LLM-Assisted Red Teaming of Diffusion Models through "Failures Are Fated, But Can Be Faded"

Som Sagar, Aditya Taparia, Ransalu Senanayake

In large deep neural networks that seem to perform surprisingly well on many tasks, we also observe a few failures related to accuracy, social biases, and alignment with human values, among others. Therefore, before deploying these models, it is crucial to characterize this failure landscape for engineers to debug or audit models. Nevertheless, it is infeasible to exhaustively test for all possible combinations of factors that could lead to a model's failure. In this paper, we improve the "Failures are fated, but can be faded" framework (arXiv:2406.07145)--a post-hoc method to explore and construct the failure landscape in pre-trained generative models--with a variety of deep reinforcement learning algorithms, screening tests, and LLM-based rewards and state generation. With the aid of limited human feedback, we then demonstrate how to restructure the failure landscape to be more desirable by moving away from the discovered failure modes. We empirically demonstrate the effectiveness of the proposed method on diffusion models. We also highlight the strengths and weaknesses of each algorithm in identifying failure modes.

LG · Jun 11, 2024
Failures Are Fated, But Can Be Faded: Characterizing and Mitigating Unwanted Behaviors in Large-Scale Vision and Language Models

Som Sagar, Aditya Taparia, Ransalu Senanayake

In large deep neural networks that seem to perform surprisingly well on many tasks, we also observe a few failures related to accuracy, social biases, and alignment with human values, among others. Therefore, before deploying these models, it is crucial to characterize this failure landscape for engineers to debug and legislative bodies to audit models. Nevertheless, it is infeasible to exhaustively test for all possible combinations of factors that could lead to a model's failure. In this paper, we introduce a post-hoc method that utilizes deep reinforcement learning to explore and construct the landscape of failure modes in pre-trained discriminative and generative models. With the aid of limited human feedback, we then demonstrate how to restructure the failure landscape to be more desirable by moving away from the discovered failure modes. We empirically show the effectiveness of the proposed method across common Computer Vision, Natural Language Processing, and Vision-Language tasks.

RO · May 6, 2024
The Role of Predictive Uncertainty and Diversity in Embodied AI and Robot Learning

Ransalu Senanayake

Uncertainty has long been a critical area of study in robotics, particularly when robots are equipped with analytical models. As we move towards the widespread use of deep neural networks in robots, which have demonstrated remarkable performance in research settings, understanding the nuances of uncertainty becomes crucial for their real-world deployment. This guide offers an overview of the importance of uncertainty and provides methods to quantify and evaluate it from an applications perspective.

RO · Nov 8, 2021
CoCo Games: Graphical Game-Theoretic Swarm Control for Communication-Aware Coverage

Malintha Fernando, Ransalu Senanayake, Martin Swany

We propose a novel framework for real-time communication-aware coverage control in networked robot swarms. Our framework unifies the robot dynamics with network-level message-routing to reach consensus on swarm formations in the presence of communication uncertainties by leveraging local information. Specifically, we formulate the communication-aware coverage as a cooperative graphical game, and use variational inference to reach mixed strategy Nash equilibria of the stage games. We experimentally validate the proposed approach in a mobile ad-hoc wireless network scenario using teams of aerial vehicles and terrestrial user equipment (UE) operating over a large geographic region of interest. We show that our approach can provide wireless coverage to stationary and mobile UEs under realistic network conditions.

LGOct 27, 2021
Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models

Phil Chen, Masha Itkina, Ransalu Senanayake et al.

Many applications of generative models rely on the marginalization of their high-dimensional output probability distributions. Normalization functions that yield sparse probability distributions can make exact marginalization more computationally tractable. However, sparse normalization functions usually require alternative loss functions for training since the log-likelihood is undefined for sparse probability distributions. Furthermore, many sparse normalization functions often collapse the multimodality of distributions. In this work, we present ev-softmax, a sparse normalization function that preserves the multimodality of probability distributions. We derive its properties, including its gradient in closed-form, and introduce a continuous family of approximations to ev-softmax that have full support and can be trained with probabilistic loss functions such as negative log-likelihood and Kullback-Leibler divergence. We evaluate our method on a variety of generative models, including variational autoencoders and auto-regressive architectures. Our method outperforms existing dense and sparse normalization techniques in distributional accuracy. We demonstrate that ev-softmax successfully reduces the dimensionality of probability distributions while maintaining multimodality.

ROAug 29, 2021
A Hybrid Rule-Based and Data-Driven Approach to Driver Modeling through Particle Filtering

Raunak Bhattacharyya, Soyeon Jung, Liam Kruse et al.

Autonomous vehicles need to model the behavior of surrounding human-driven vehicles to be safe and efficient traffic participants. Existing approaches to modeling human driving behavior have relied on both data-driven and rule-based methods. While data-driven models are more expressive, rule-based models are interpretable, which is an important requirement for safety-critical domains like driving. However, rule-based models are not sufficiently representative of data, and data-driven models are yet unable to generate realistic traffic simulation due to unrealistic driving behavior such as collisions. In this paper, we propose a methodology that combines rule-based modeling with data-driven learning. While the rules are governed by interpretable parameters of the driver model, these parameters are learned online from driving demonstration data using particle filtering. We perform driver modeling experiments on the task of highway driving and merging using data from three real-world driving demonstration datasets. Our results show that driver models based on our hybrid rule-based and data-driven approach can accurately capture real-world driving behavior. Further, we assess the realism of the driving behavior generated by our model by having humans perform a driving Turing test, where they are asked to distinguish between videos of real driving and those generated using our driver models.

ROJul 23, 2021
3D Radar Velocity Maps for Uncertain Dynamic Environments

Ransalu Senanayake, Kyle Beltran Hatch, Jason Zheng et al.

Future urban transportation concepts include a mixture of ground and air vehicles with varying degrees of autonomy in a congested environment. In such dynamic environments, occupancy maps alone are not sufficient for safe path planning. Safe and efficient transportation requires reasoning about the 3D flow of traffic and properly modeling uncertainty. Several different approaches can be taken for developing 3D velocity maps. This paper explores a Bayesian approach that captures our uncertainty in the map given training data. The approach involves projecting spatial coordinates into a high-dimensional feature space and then applying Bayesian linear regression to make predictions and quantify uncertainty in our estimates. On a collection of air and ground datasets, we demonstrate that this approach is effective and more scalable than several alternative approaches.
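
The regression step described in this abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the radial basis function features, the grid of feature centers, the precisions `alpha` and `beta`, and the synthetic data. It projects 3D coordinates into a feature space, fits Bayesian linear regression in closed form, and returns a predictive variance alongside each velocity estimate.

```python
import numpy as np

def rbf_features(X, centers, gamma):
    """Project points X (n, d) onto RBF features around `centers` (m, d)."""
    sq_dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def fit_blr(Phi, y, alpha=1.0, beta=25.0):
    """Weight posterior N(m, S) for prior N(0, I / alpha) and noise precision beta."""
    S_inv = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    S = np.linalg.inv(S_inv)
    m = beta * S @ Phi.T @ y
    return m, S

def predict_blr(Phi_q, m, S, beta=25.0):
    """Predictive mean and variance at query features Phi_q."""
    mean = Phi_q @ m
    var = 1.0 / beta + np.einsum("ij,jk,ik->i", Phi_q, S, Phi_q)
    return mean, var

# Synthetic data (hypothetical): one velocity component observed only where x <= 0.5.
rng = np.random.default_rng(0)
X = rng.uniform([0.0, 0.0, 0.0], [0.5, 1.0, 1.0], size=(200, 3))
y = np.sin(2 * np.pi * X[:, 1]) + 0.2 * rng.normal(size=200)

# Feature centers on a coarse 3 x 3 x 3 grid covering the whole unit cube.
grid = np.linspace(0.0, 1.0, 3)
centers = np.array([[a, b, c] for a in grid for b in grid for c in grid])
gamma = 20.0

m, S = fit_blr(rbf_features(X, centers, gamma), y)
queries = np.array([[0.25, 0.5, 0.5],   # inside the observed region
                    [1.0, 0.5, 0.5]])   # outside the observed region
mean, var = predict_blr(rbf_features(queries, centers, gamma), m, S)
var_seen, var_unseen = var
```

In this toy setup the predictive variance is larger at the query outside the observed region, which is the kind of map uncertainty a planner could take into account.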

CVJul 10, 2021
Out of Distribution Detection and Adversarial Attacks on Deep Neural Networks for Robust Medical Image Analysis

Anisie Uwimana, Ransalu Senanayake

Deep learning models have become a popular choice for medical image analysis. However, the poor generalization performance of deep learning models limits them from being deployed in the real world as robustness is critical for medical applications. For instance, the state-of-the-art Convolutional Neural Networks (CNNs) fail to detect adversarial samples or samples drawn statistically far away from the training distribution. In this work, we experimentally evaluate the robustness of a Mahalanobis distance-based confidence score, a simple yet effective method for detecting abnormal input samples, in classifying malaria parasitized cells and uninfected cells. Results indicated that the Mahalanobis confidence score detector improves the performance and robustness of deep learning models, and achieves state-of-the-art performance on both out-of-distribution (OOD) and adversarial samples.
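
A Mahalanobis distance-based confidence score can be sketched as follows. This is a hedged illustration and not the paper's implementation: it assumes the commonly used formulation with class-conditional Gaussians that share one covariance matrix, and it uses synthetic 2D vectors as a stand-in for CNN features. The function names are hypothetical.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Estimate per-class means and one shared (tied) precision matrix."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center every sample on the mean of its own class, then pool the covariance.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    return means, np.linalg.pinv(cov)

def mahalanobis_confidence(x, means, precision):
    """Negative squared Mahalanobis distance to the closest class mean."""
    diffs = means - x
    sq_dists = np.einsum("cd,de,ce->c", diffs, precision, diffs)
    return -sq_dists.min()

# Synthetic stand-in for features of two classes (hypothetical data).
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
                   rng.normal(4.0, 0.5, size=(100, 2))])
labels = np.repeat([0, 1], 100)

means, precision = fit_class_gaussians(feats, labels)
score_in = mahalanobis_confidence(np.array([0.1, -0.1]), means, precision)
score_far = mahalanobis_confidence(np.array([20.0, -20.0]), means, precision)
```

A sample far from both class means receives a much lower score than one near a class mean, so thresholding the score gives a simple detector for abnormal inputs.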

RONov 18, 2020
Double-Prong ConvLSTM for Spatiotemporal Occupancy Prediction in Dynamic Environments

Maneekwan Toyungyernsub, Masha Itkina, Ransalu Senanayake et al.

Predicting the future occupancy state of an environment is important to enable informed decisions for autonomous vehicles. Common challenges in occupancy prediction include vanishing dynamic objects and blurred predictions, especially for long prediction horizons. In this work, we propose a double-prong neural network architecture to predict the spatiotemporal evolution of the occupancy state. One prong is dedicated to predicting how the static environment will be observed by the moving ego vehicle. The other prong predicts how the dynamic objects in the environment will move. Experiments conducted on the real-world Waymo Open Dataset indicate that the fused output of the two prongs is capable of retaining dynamic objects and reducing blurriness in the predictions for longer time horizons than baseline models.

CVNov 3, 2020
Out-of-Distribution Detection for Automotive Perception

Julia Nitsch, Masha Itkina, Ransalu Senanayake et al.

Neural networks (NNs) are widely used for object classification in autonomous driving. However, NNs can fail on input data not well represented by the training dataset, known as out-of-distribution (OOD) data. A mechanism to detect OOD samples is important for safety-critical applications, such as automotive perception, to trigger a safe fallback mode. NNs often rely on softmax normalization for confidence estimation, which can lead to high confidences being assigned to OOD samples, thus hindering the detection of failures. This paper presents a method for determining whether inputs are OOD, which does not require OOD data during training and does not increase the computational cost of inference. The latter property is especially important in automotive applications with limited computational resources and real-time constraints. Our proposed approach outperforms state-of-the-art methods on real-world automotive datasets.

LGOct 19, 2020
Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders

Masha Itkina, Boris Ivanovic, Ransalu Senanayake et al.

Discrete latent spaces in variational autoencoders have been shown to effectively capture the data distribution for many real-world problems such as natural language understanding, human intent prediction, and visual scene representation. However, discrete latent spaces need to be sufficiently large to capture the complexities of real-world data, rendering downstream tasks computationally challenging. For instance, performing motion planning in a high-dimensional latent representation of the environment could be intractable. We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder, while preserving its learned multimodality. As a post hoc latent space reduction technique, we use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not. Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique at reducing the discrete latent sample space size of a model while maintaining its learned multimodality.

CVJul 1, 2020
Online Domain Adaptation for Occupancy Mapping

Anthony Tompkins, Ransalu Senanayake, Fabio Ramos

Creating accurate spatial representations that take into account uncertainty is critical for autonomous robots to safely navigate in unstructured environments. Although recent LIDAR-based mapping techniques can produce robust occupancy maps, learning the parameters of such models demands considerable computational time, discouraging them from being used in real-time and large-scale applications such as autonomous driving. Recognizing the fact that real-world structures exhibit similar geometric features across a variety of urban environments, in this paper, we argue that it is redundant to learn all geometry-dependent parameters from scratch. Instead, we propose a theoretical framework building upon the theory of optimal transport to adapt model parameters to account for changes in the environment, significantly amortizing the training cost. Further, with the use of high-fidelity driving simulators and real-world datasets, we demonstrate how parameters of 2D and 3D occupancy maps can be automatically adapted to accord with local spatial changes. We validate various domain adaptation paradigms through a series of experiments, ranging from inter-domain feature transfer to simulation-to-real-world feature transfer. Experiments verified the possibility of estimating parameters with a negligible computational and memory cost, enabling large-scale probabilistic mapping in urban environments.

ROJul 1, 2020
Directional Primitives for Uncertainty-Aware Motion Estimation in Urban Environments

Ransalu Senanayake, Maneekwan Toyungyernsub, Mingyu Wang et al.

We can use driving data collected over a long period of time to extract rich information about how vehicles behave in different areas of the roads. In this paper, we introduce the concept of directional primitives, which is a representation of prior information of road networks. Specifically, we represent the uncertainty of directions using a mixture of von Mises distributions and associated speeds using gamma distributions. These location-dependent primitives can be combined with motion information of surrounding vehicles to predict their future behavior in the form of probability distributions. Experiments conducted on highways, intersections, and roundabouts in the Carla simulator, as well as real-world urban driving datasets, indicate that primitives lead to better uncertainty-aware motion estimation.
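
The representation named in this abstract, a mixture of von Mises distributions over direction paired with a gamma distribution over speed, can be written out in a few lines. The sketch below only evaluates the densities. The mixture weights, mean directions, concentrations, and gamma parameters are made-up values for a hypothetical two-way road, and the function names are illustrative.

```python
import math

def bessel_i0(x, terms=30):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    return sum((x / 2) ** (2 * k) / math.factorial(k) ** 2 for k in range(terms))

def von_mises_pdf(theta, mu, kappa):
    """Density of a von Mises distribution with mean direction mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2 * math.pi * bessel_i0(kappa))

def direction_pdf(theta, weights, mus, kappas):
    """Mixture of von Mises densities over heading angle theta (radians)."""
    return sum(w * von_mises_pdf(theta, mu, k)
               for w, mu, k in zip(weights, mus, kappas))

def speed_pdf(v, shape, scale):
    """Gamma density over speed v > 0."""
    return v ** (shape - 1) * math.exp(-v / scale) / (math.gamma(shape) * scale ** shape)

# Hypothetical primitive for one road cell: most traffic heads along 0 rad,
# some along pi rad, with speeds concentrated around 12 to 14 m/s.
weights, mus, kappas = [0.7, 0.3], [0.0, math.pi], [8.0, 8.0]
shape, scale = 9.0, 1.5

p_along = direction_pdf(0.0, weights, mus, kappas)
p_across = direction_pdf(math.pi / 2, weights, mus, kappas)
p_speed = speed_pdf(12.0, shape, scale)
```

Because the von Mises density is periodic, headings near 0 and near 2π are treated as close, which a Gaussian over raw angles would not do.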

AIJun 10, 2020
Modeling Human Driving Behavior through Generative Adversarial Imitation Learning

Raunak Bhattacharyya, Blake Wulfe, Derek Phillips et al.

An open problem in autonomous vehicle safety validation is building reliable models of human driving behavior in simulation. This work presents an approach to learn neural driving policies from real world driving demonstration data. We model human driving as a sequential decision making problem that is characterized by non-linearity and stochasticity, and unknown underlying cost functions. Imitation learning is an approach for generating intelligent behavior when the cost function is unknown or difficult to specify. Building upon work in inverse reinforcement learning (IRL), Generative Adversarial Imitation Learning (GAIL) aims to provide effective imitation even for problems with large or continuous state and action spaces, such as modeling human driving. This article describes the use of GAIL for learning-based driver modeling. Because driver modeling is inherently a multi-agent problem, where the interaction between agents needs to be modeled, this paper describes a parameter-sharing extension of GAIL called PS-GAIL to tackle multi-agent driver modeling. In addition, GAIL is domain agnostic, making it difficult to encode specific knowledge relevant to driving in the learning process. This paper describes Reward Augmented Imitation Learning (RAIL), which modifies the reward signal to provide domain-specific knowledge to the agent. Finally, human demonstrations are dependent upon latent factors that may not be captured by GAIL. This paper describes Burn-InfoGAIL, which allows for disentanglement of latent variability in demonstrations. Imitation learning experiments are performed using NGSIM, a real-world highway driving dataset. Experiments show that these modifications to GAIL can successfully model highway driving behavior, accurately replicating human demonstrations and generating realistic, emergent behavior in the traffic flow arising from the interaction between driving agents.

AIMay 6, 2020
Online Parameter Estimation for Human Driver Behavior Prediction

Raunak Bhattacharyya, Ransalu Senanayake, Kyle Brown et al.

Driver models are invaluable for planning in autonomous vehicles as well as validating their safety in simulation. Highly parameterized black-box driver models are very expressive, and can capture nuanced behavior. However, they usually lack interpretability and sometimes exhibit unrealistic, even dangerous, behavior. Rule-based models are interpretable, and can be designed to guarantee "safe" behavior, but are less expressive due to their low number of parameters. In this article, we show that online parameter estimation applied to the Intelligent Driver Model captures nuanced individual driving behavior while providing collision-free trajectories. We solve the online parameter estimation problem using particle filtering, and benchmark performance against rule-based and black-box driver models on two real-world driving data sets. We evaluate the closeness of our driver model to ground truth data demonstration and also assess the safety of the resulting emergent driving behavior.
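
A rough sketch of particle filtering over Intelligent Driver Model (IDM) parameters is given below. It is an illustration under stated assumptions, not the method from the paper: the choice to estimate only the desired speed `v0` and time headway `T`, the Gaussian observation model on acceleration, the jitter scales, and the synthetic observations are all assumptions, and the helper names are hypothetical.

```python
import numpy as np

def idm_accel(v, gap, dv, v0, T, a_max=1.5, b=2.0, s0=2.0, delta=4.0):
    """IDM acceleration; dv is the approach rate (own speed minus leader speed)."""
    s_star = s0 + np.maximum(0.0, v * T + v * dv / (2.0 * np.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

def particle_weights(particles, obs_accel, v, gap, dv, obs_std=0.3):
    """Normalized Gaussian likelihood of the observed acceleration per particle."""
    pred = idm_accel(v, gap, dv, particles[:, 0], particles[:, 1])
    log_w = -0.5 * ((obs_accel - pred) / obs_std) ** 2
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return w / w.sum()

def resample(particles, weights, rng, jitter=(0.3, 0.03)):
    """Multinomial resampling followed by small Gaussian jitter on (v0, T)."""
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    moved = particles[idx] + rng.normal(0.0, jitter, size=particles.shape)
    return np.clip(moved, [5.0, 0.1], None)  # keep both parameters positive

# Synthetic driver (hypothetical) observed over 100 time steps.
rng = np.random.default_rng(0)
true_v0, true_T = 33.0, 1.5
particles = np.column_stack([rng.uniform(20, 40, 500), rng.uniform(0.5, 3.0, 500)])
for _ in range(100):
    v, gap, dv = rng.uniform(15, 30), rng.uniform(20, 60), rng.uniform(-2, 2)
    obs = idm_accel(v, gap, dv, true_v0, true_T) + rng.normal(0.0, 0.1)
    w = particle_weights(particles, obs, v, gap, dv)
    particles = resample(particles, w, rng)
estimate = particles.mean(axis=0)  # posterior mean of (v0, T)
```

Each particle remains an ordinary IDM parameter vector, so the estimate stays interpretable while being updated one observation at a time.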

RODec 4, 2019
Dynamic Hilbert Maps: Real-Time Occupancy Predictions in Changing Environment

Vitor Guizilini, Ransalu Senanayake, Fabio Ramos

This paper addresses the problem of learning instantaneous occupancy levels of dynamic environments and predicting future occupancy levels. Due to the complexity of most real-world environments, such as urban streets or crowded areas, the efficient and robust incorporation of temporal dependencies into otherwise static occupancy models remains a challenge. We propose a method to capture the spatial uncertainty of moving objects and incorporate this uncertainty information into a continuous occupancy map represented in a rich high-dimensional feature space. Experiments performed using LIDAR data verified the real-time performance of the algorithm.

ROSep 3, 2018
Directional grid maps: modeling multimodal angular uncertainty in dynamic environments

Ransalu Senanayake, Fabio Ramos

Robots often have to deal with the challenges of operating in dynamic and sometimes unpredictable environments. Although an occupancy map of the environment is sufficient for navigation of a mobile robot or manipulation tasks with a robotic arm in static environments, robots operating in dynamic environments demand richer information to improve robustness, efficiency, and safety. For instance, in path planning, it is important to know the direction of motion of dynamic objects at various locations of the environment for safer navigation or human-robot interaction. In this paper, we introduce directional statistics into robotic mapping to model circular data. Primarily, as a complement to occupancy grid maps, we propose directional grid maps to represent the location-wide long-term angular motion of the environment. Being highly representative, this defines a probability measure-field over the longitude-latitude space rather than a scalar-field or a vector field. We further demonstrate how the same theory can be used to model angular variations in the spatial domain, temporal domain, and spatiotemporal domain. We carried out a series of experiments to validate the proposed models using a variety of robots having different sensors such as RGB cameras and LiDARs on simulated and real-world settings in both indoor and outdoor environments.