HCJun 3
What Can Eye Gaze Teach Us About Real-World Cycling? Insights From the Oxford RobotCycle ProjectBenjamin Hardin, Efimia Panagiotaki, Daniele De Martini et al.
Although much is known about the physical danger of cycling situations, less is understood about the perceived danger of cycling. Furthermore, perception of danger may be filtered at a subconscious level and therefore difficult for one to self-report. To this end, these subconscious perceptions can be revealed through physiological metrics such as eye gaze. This paper explores the perceived safety of cycling in Oxford, United Kingdom and explores the ability of wearable eye tracking glasses to produce insights about the differences in perception under different environments and events. This paper finds that eye gaze patterns change between using bike lanes, car lanes and shared bus lanes, representing different cognitive challenges of each lane type. This paper presents that different intersections have significantly different eye gaze patterns which may have implications for cyclist stress. Finally, eye gaze patterns differ in the presence of events such as passes and pedestrians in the road compared to when cycling with no events. This paper draws conclusions on the benefits and limitations of using wearable eye trackers to estimate stress and cyclist workload.
ROApr 13, 2023
CAR-DESPOT: Causally-Informed Online POMDP Planning for Robots in Confounded EnvironmentsRicardo Cannizzaro, Lars Kunze · oxford
Robots operating in real-world environments must reason about possible outcomes of stochastic actions and make decisions based on partial observations of the true world state. A major challenge for making accurate and robust action predictions is the problem of confounding, which if left untreated can lead to prediction errors. The partially observable Markov decision process (POMDP) is a widely-used framework to model these stochastic and partially-observable decision-making problems. However, due to a lack of explicit causal semantics, POMDP planning methods are prone to confounding bias and thus in the presence of unobserved confounders may produce underperforming policies. This paper presents a novel causally-informed extension of "anytime regularized determinized sparse partially observable tree" (AR-DESPOT), a modern anytime online POMDP planner, using causal modelling and inference to eliminate errors caused by unmeasured confounder variables. We further propose a method to learn offline the partial parameterisation of the causal model for planning, from ground truth model data. We evaluate our methods on a toy problem with an unobserved confounder and show that the learned causal model is highly accurate, while our planning method is more robust to confounding and produces overall higher performing policies than AR-DESPOT.
ROJul 2, 2023
Effects of Explanation Specificity on Passengers in Autonomous DrivingDaniel Omeiza, Raunak Bhattacharyya, Nick Hawes et al. · oxford
The nature of explanations provided by an explainable AI algorithm has been a topic of interest in the explainable AI and human-computer interaction community. In this paper, we investigate the effects of natural language explanations' specificity on passengers in autonomous driving. We extended an existing data-driven tree-based explainer algorithm by adding a rule-based option for explanation generation. We generated auditory natural language explanations with different levels of specificity (abstract and specific) and tested these explanations in a within-subject user study (N=39) using an immersive physical driving simulation setup. Our results showed that both abstract and specific explanations had similar positive effects on passengers' perceived safety and the feeling of anxiety. However, the specific explanations influenced the desire of passengers to takeover driving control from the autonomous vehicle (AV), while the abstract explanations did not. We conclude that natural language auditory explanations are useful for passengers in autonomous driving, and their specificity levels could influence how much in-vehicle participants would wish to be in control of the driving activity.
ROAug 11, 2023
Towards a Causal Probabilistic Framework for Prediction, Action-Selection & Explanations for Robot Block-Stacking TasksRicardo Cannizzaro, Jonathan Routley, Lars Kunze · oxford
Uncertainties in the real world mean that is impossible for system designers to anticipate and explicitly design for all scenarios that a robot might encounter. Thus, robots designed like this are fragile and fail outside of highly-controlled environments. Causal models provide a principled framework to encode formal knowledge of the causal relationships that govern the robot's interaction with its environment, in addition to probabilistic representations of noise and uncertainty typically encountered by real-world robots. Combined with causal inference, these models permit an autonomous agent to understand, reason about, and explain its environment. In this work, we focus on the problem of a robot block-stacking task due to the fundamental perception and manipulation capabilities it demonstrates, required by many applications including warehouse logistics and domestic human support robotics. We propose a novel causal probabilistic framework to embed a physics simulation capability into a structural causal model to permit robots to perceive and assess the current state of a block-stacking task, reason about the next-best action from placement candidates, and generate post-hoc counterfactual explanations. We provide exemplar next-best action selection results and outline planned experimentation in simulated and real-world robot block-stacking tasks.
AIApr 19, 2022
From Spoken Thoughts to Automated Driving Commentary: Predicting and Explaining Intelligent Vehicles' ActionsDaniel Omeiza, Sule Anjomshoae, Helena Webb et al.
In commentary driving, drivers verbalise their observations, assessments and intentions. By speaking out their thoughts, both learning and expert drivers are able to create a better understanding and awareness of their surroundings. In the intelligent vehicle context, automated driving commentary can provide intelligible explanations about driving actions, thereby assisting a driver or an end-user during driving operations in challenging and safety-critical scenarios. In this paper, we conducted a field study in which we deployed a research vehicle in an urban environment to obtain data. While collecting sensor data of the vehicle's surroundings, we obtained driving commentary from a driving instructor using the think-aloud protocol. We analysed the driving commentary and uncovered an explanation style; the driver first announces his observations, announces his plans, and then makes general remarks. He also makes counterfactual comments. We successfully demonstrated how factual and counterfactual natural language explanations that follow this style could be automatically generated using a transparent tree-based approach. Generated explanations for longitudinal actions (e.g., stop and move) were deemed more intelligible and plausible by human judges compared to lateral actions, such as lane changes. We discussed how our approach can be built on in the future to realise more robust and effective explainability for driver assistance as well as partial and conditional automation of driving functions.
ROAug 16, 2024Code
S-RAF: A Simulation-Based Robustness Assessment Framework for Responsible Autonomous DrivingDaniel Omeiza, Pratik Somaiya, Jo-Ann Pattinson et al.
As artificial intelligence (AI) technology advances, ensuring the robustness and safety of AI-driven systems has become paramount. However, varying perceptions of robustness among AI developers create misaligned evaluation metrics, complicating the assessment and certification of safety-critical and complex AI systems such as autonomous driving (AD) agents. To address this challenge, we introduce Simulation-Based Robustness Assessment Framework (S-RAF) for autonomous driving. S-RAF leverages the CARLA Driving simulator to rigorously assess AD agents across diverse conditions, including faulty sensors, environmental changes, and complex traffic situations. By quantifying robustness and its relationship with other safety-critical factors, such as carbon emissions, S-RAF aids developers and stakeholders in building safe and responsible driving agents, and streamlining safety certification processes. Furthermore, S-RAF offers significant advantages, such as reduced testing costs, and the ability to explore edge cases that may be unsafe to test in the real world. The code for this framework is available here: https://github.com/cognitive-robots/rai-leaderboard
ROAug 19, 2023
Towards Probabilistic Causal Discovery, Inference & Explanations for Autonomous Drones in Mine Surveying TasksRicardo Cannizzaro, Rhys Howard, Paulina Lewinska et al. · oxford
Causal modelling offers great potential to provide autonomous agents the ability to understand the data-generation process that governs their interactions with the world. Such models capture formal knowledge as well as probabilistic representations of noise and uncertainty typically encountered by autonomous robots in real-world environments. Thus, causality can aid autonomous agents in making decisions and explaining outcomes, but deploying causality in such a manner introduces new challenges. Here we identify challenges relating to causality in the context of a drone system operating in a salt mine. Such environments are challenging for autonomous agents because of the presence of confounders, non-stationarity, and a difficulty in building complete causal models ahead of time. To address these issues, we propose a probabilistic causal framework consisting of: causally-informed POMDP planning, online SCM adaptation, and post-hoc counterfactual explanations. Further, we outline planned experimentation to evaluate the framework integrated with a drone system in simulated mine environments and on a real-world mine dataset.
CVFeb 7, 2023
Explainable Action Prediction through Self-Supervision on Scene GraphsPawit Kochakarn, Daniele De Martini, Daniel Omeiza et al.
This work explores scene graphs as a distilled representation of high-level information for autonomous driving, applied to future driver-action prediction. Given the scarcity and strong imbalance of data samples, we propose a self-supervision pipeline to infer representative and well-separated embeddings. Key aspects are interpretability and explainability; as such, we embed in our architecture attention mechanisms that can create spatial and temporal heatmaps on the scene graphs. We evaluate our system on the ROAD dataset against a fully-supervised approach, showing the superiority of our training regime.
LGAug 8, 2023
Semantic Interpretation and Validation of Graph Attention-based Explanations for GNN ModelsEfimia Panagiotaki, Daniele De Martini, Lars Kunze
In this work, we propose a methodology for investigating the use of semantic attention to enhance the explainability of Graph Neural Network (GNN)-based models. Graph Deep Learning (GDL) has emerged as a promising field for tasks like scene interpretation, leveraging flexible graph structures to concisely describe complex features and relationships. As traditional explainability methods used in eXplainable AI (XAI) cannot be directly applied to such structures, graph-specific approaches are introduced. Attention has been previously employed to estimate the importance of input features in GDL, however, the fidelity of this method in generating accurate and consistent explanations has been questioned. To evaluate the validity of using attention weights as feature importance indicators, we introduce semantically-informed perturbations and correlate predicted attention weights with the accuracy of the model. Our work extends existing attention-based graph explainability methods by analysing the divergence in the attention distributions in relation to semantically sorted feature sets and the behaviour of a GNN model, efficiently estimating feature importance. We apply our methodology on a lidar pointcloud estimation model successfully identifying key semantic classes that contribute to enhanced performance, effectively generating reliable post-hoc semantic explanations.
ROAug 7, 2023
SEM-GAT: Explainable Semantic Pose Estimation using Learned Graph AttentionEfimia Panagiotaki, Daniele De Martini, Georgi Pramatarov et al.
This paper proposes a Graph Neural Network(GNN)-based method for exploiting semantics and local geometry to guide the identification of reliable pointcloud registration candidates. Semantic and morphological features of the environment serve as key reference points for registration, enabling accurate lidar-based pose estimation. Our novel lightweight static graph structure informs our attention-based node aggregation network by identifying semantic-instance relationships, acting as an inductive bias to significantly reduce the computational burden of pointcloud registration. By connecting candidate nodes and exploiting cross-graph attention, we identify confidence scores for all potential registration correspondences and estimate the displacement between pointcloud scans. Our pipeline enables introspective analysis of the model's performance by correlating it with the individual contributions of local structures in the environment, providing valuable insights into the system's behaviour. We test our method on the KITTI odometry dataset, achieving competitive accuracy compared to benchmark methods and a higher track smoothness while relying on significantly fewer network parameters.
LGMay 8Code
Quantile-Coupled Flow Matching for Distributional Reinforcement LearningMichael Groom, Victor-Alexandru Darvariu, Lars Kunze et al.
Unlike standard expected-return Reinforcement Learning (RL), Distributional RL (DRL) models the full return distribution, making it better-suited for uncertainty-aware and risk-sensitive decision-making. Conditional Flow Matching (CFM) critics have recently attracted attention for modelling continuous, multi-modal return distributions. Despite this interest, there remains a substantial metric mismatch: DRL theory relies on the distributional Bellman operator being contractive in the $p$-Wasserstein distance, yet existing CFM critics are trained with arbitrary source-target couplings, so their flow-matching losses are not Wasserstein-aligned surrogates for matching Bellman target return distributions. In this work, we address this mismatch by proposing FlowIQN, a CFM critic that sorts source and Bellman target samples within each mini-batch to approximate the monotone optimal transport coupling, replacing arbitrary pairings with quantile-aligned flow paths. We prove that the loss of our quantile-coupled CFM critic yields a Wasserstein-aligned approximate projection compatible with the foundations of DRL. To our knowledge, FlowIQN is the first flow-matching distributional critic with an explicit Wasserstein-aligned projection guarantee. We further extend FlowIQN with shortcut models for efficient inference. Empirical results show that FlowIQN improves Wasserstein return-distribution accuracy over other CFM critics. It also yields competitive performance on offline RL benchmarks across multiple policy extraction methods, providing a theoretically grounded CFM critic that is readily compatible with DRL pipelines. Code: https://github.com/ori-goals/flowIQN.
ROSep 18, 2023
CC-SGG: Corner Case Scenario Generation using Learned Scene GraphsGeorge Drayson, Efimia Panagiotaki, Daniel Omeiza et al.
Corner case scenarios are an essential tool for testing and validating the safety of autonomous vehicles (AVs). As these scenarios are often insufficiently present in naturalistic driving datasets, augmenting the data with synthetic corner cases greatly enhances the safe operation of AVs in unique situations. However, the generation of synthetic, yet realistic, corner cases poses a significant challenge. In this work, we introduce a novel approach based on Heterogeneous Graph Neural Networks (HGNNs) to transform regular driving scenarios into corner cases. To achieve this, we first generate concise representations of regular driving scenes as scene graphs, minimally manipulating their structure and properties. Our model then learns to perturb those graphs to generate corner cases using attention and triple embeddings. The input and perturbed graphs are then imported back into the simulation to generate corner case scenarios. Our model successfully learned to produce corner cases from input scene graphs, achieving 89.9% prediction accuracy on our testing dataset. We further validate the generated scenarios on baseline autonomous driving methods, demonstrating our model's ability to effectively create critical situations for the baselines.
CLApr 12, 2023
Textual Explanations for Automated Commentary DrivingMarc Alexander Kühn, Daniel Omeiza, Lars Kunze
The provision of natural language explanations for the predictions of deep-learning-based vehicle controllers is critical as it enhances transparency and easy audit. In this work, a state-of-the-art (SOTA) prediction and explanation model is thoroughly evaluated and validated (as a benchmark) on the new Sense--Assess--eXplain (SAX). Additionally, we developed a new explainer model that improved over the baseline architecture in two ways: (i) an integration of part of speech prediction and (ii) an introduction of special token penalties. On the BLEU metric, our explanation generation technique outperformed SOTA by a factor of 7.7 when applied on the BDD-X dataset. The description generation technique is also improved by a factor of 1.3. Hence, our work contributes to the realisation of future explainable autonomous vehicles.
AIJan 31, 2023
Evaluating Temporal Observation-Based Causal Discovery Techniques Applied to Road Driver BehaviourRhys Howard, Lars Kunze
Autonomous robots are required to reason about the behaviour of dynamic agents in their environment. The creation of models to describe these relationships is typically accomplished through the application of causal discovery techniques. However, as it stands observational causal discovery techniques struggle to adequately cope with conditions such as causal sparsity and non-stationarity typically seen during online usage in autonomous agent domains. Meanwhile, interventional techniques are not always feasible due to domain restrictions. In order to better explore the issues facing observational techniques and promote further discussion of these topics we carry out a benchmark across 10 contemporary observational temporal causal discovery methods in the domain of autonomous driving. By evaluating these methods upon causal scenes drawn from real world datasets in addition to those generated synthetically we highlight where improvements need to be made in order to facilitate the application of causal discovery techniques to the aforementioned use-cases. Finally, we discuss potential directions for future work that could help better tackle the difficulties currently experienced by state of the art techniques.
ROJun 6, 2023
Simulation-Based Counterfactual Causal Discovery on Real World Driver BehaviourRhys Howard, Lars Kunze
Being able to reason about how one's behaviour can affect the behaviour of others is a core skill required of intelligent driving agents. Despite this, the state of the art struggles to meet the need of agents to discover causal links between themselves and others. Observational approaches struggle because of the non-stationarity of causal links in dynamic environments, and the sparsity of causal interactions while requiring the approaches to work in an online fashion. Meanwhile interventional approaches are impractical as a vehicle cannot experiment with its actions on a public road. To counter the issue of non-stationarity we reformulate the problem in terms of extracted events, while the previously mentioned restriction upon interventions can be overcome with the use of counterfactual simulation. We present three variants of the proposed counterfactual causal discovery method and evaluate these against state of the art observational temporal causal discovery methods across 3396 causal scenes extracted from a real world driving dataset. We find that the proposed method significantly outperforms the state of the art on the proposed task quantitatively and can offer additional insights by comparing the outcome of an alternate series of decisions in a way that observational and interventional approaches cannot.
IVOct 4, 2023
LROC-PANGU-GAN: Closing the Simulation Gap in Learning Crater Segmentation with Planetary SimulatorsJaewon La, Jaime Phadke, Matt Hutton et al.
It is critical for probes landing on foreign planetary bodies to be able to robustly identify and avoid hazards - as, for example, steep cliffs or deep craters can pose significant risks to a probe's landing and operational success. Recent applications of deep learning to this problem show promising results. These models are, however, often learned with explicit supervision over annotated datasets. These human-labelled crater databases, such as from the Lunar Reconnaissance Orbiter Camera (LROC), may lack in consistency and quality, undermining model performance - as incomplete and/or inaccurate labels introduce noise into the supervisory signal, which encourages the model to learn incorrect associations and results in the model making unreliable predictions. Physics-based simulators, such as the Planet and Asteroid Natural Scene Generation Utility, have, in contrast, perfect ground truth, as the internal state that they use to render scenes is known with exactness. However, they introduce a serious simulation-to-real domain gap - because of fundamental differences between the simulated environment and the real-world arising from modelling assumptions, unaccounted for physical interactions, environmental variability, etc. Therefore, models trained on their outputs suffer when deployed in the face of realism they have not encountered in their training data distributions. In this paper, we therefore introduce a system to close this "realism" gap while retaining label fidelity. We train a CycleGAN model to synthesise LROC from Planet and Asteroid Natural Scene Generation Utility (PANGU) images. We show that these improve the training of a downstream crater segmentation network, with segmentation performance on a test set of real LROC images improved as compared to using only simulated PANGU images.
HCAug 16, 2024
A Transparency Paradox? Investigating the Impact of Explanation Specificity and Autonomous Vehicle Perceptual Inaccuracies on PassengersDaniel Omeiza, Raunak Bhattacharyya, Marina Jirotka et al.
Transparency in automated systems could be afforded through the provision of intelligible explanations. While transparency is desirable, might it lead to catastrophic outcomes (such as anxiety), that could outweigh its benefits? It's quite unclear how the specificity of explanations (level of transparency) influences recipients, especially in autonomous driving (AD). In this work, we examined the effects of transparency mediated through varying levels of explanation specificity in AD. We first extended a data-driven explainer model by adding a rule-based option for explanation generation in AD, and then conducted a within-subject lab study with 39 participants in an immersive driving simulator to study the effect of the resulting explanations. Specifically, our investigation focused on: (1) how different types of explanations (specific vs. abstract) affect passengers' perceived safety, anxiety, and willingness to take control of the vehicle when the vehicle perception system makes erroneous predictions; and (2) the relationship between passengers' behavioural cues and their feelings during the autonomous drives. Our findings showed that passengers felt safer with specific explanations when the vehicle's perception system had minimal errors, while abstract explanations that hid perception errors led to lower feelings of safety. Anxiety levels increased when specific explanations revealed perception system errors (high transparency). We found no significant link between passengers' visual patterns and their anxiety levels. Our study suggests that passengers prefer clear and specific explanations (high transparency) when they originate from autonomous vehicles (AVs) with optimal perceptual accuracy.
AIAug 25, 2023
Generating and Explaining Corner Cases Using Learnt Probabilistic Lane GraphsEnrik Maci, Rhys Howard, Lars Kunze
Validating the safety of Autonomous Vehicles (AVs) operating in open-ended, dynamic environments is challenging as vehicles will eventually encounter safety-critical situations for which there is not representative training data. By increasing the coverage of different road and traffic conditions and by including corner cases in simulation-based scenario testing, the safety of AVs can be improved. However, the creation of corner case scenarios including multiple agents is non-trivial. Our approach allows engineers to generate novel, realistic corner cases based on historic traffic data and to explain why situations were safety-critical. In this paper, we introduce Probabilistic Lane Graphs (PLGs) to describe a finite set of lane positions and directions in which vehicles might travel. The structure of PLGs is learnt directly from spatio-temporal traffic data. The graph model represents the actions of the drivers in response to a given state in the form of a probabilistic policy. We use reinforcement learning techniques to modify this policy and to generate realistic and explainable corner case scenarios which can be used for assessing the safety of AVs.
AIJul 30, 2024
From Feature Importance to Natural Language Explanations Using LLMs with RAGSule Tekkesinoglu, Lars Kunze
As machine learning becomes increasingly integral to autonomous decision-making processes involving human interaction, the necessity of comprehending the model's outputs through conversational means increases. Most recently, foundation models are being explored for their potential as post hoc explainers, providing a pathway to elucidate the decision-making mechanisms of predictive models. In this work, we introduce traceable question-answering, leveraging an external knowledge repository to inform the responses of Large Language Models (LLMs) to user queries within a scene understanding task. This knowledge repository comprises contextual details regarding the model's output, containing high-level features, feature importance, and alternative probabilities. We employ subtractive counterfactual reasoning to compute feature importance, a method that entails analysing output variations resulting from decomposing semantic features. Furthermore, to maintain a seamless conversational flow, we integrate four key characteristics - social, causal, selective, and contrastive - drawn from social science research on human explanations into a single-shot prompt, guiding the response generation process. Our evaluation demonstrates that explanations generated by the LLMs encompassed these elements, indicating its potential to bridge the gap between complex model outputs and natural language expressions.
LGSep 17, 2025Code
Ensemble of Pre-Trained Models for Long-Tailed Trajectory PredictionDivya Thuremella, Yi Yang, Simon Wanna et al.
This work explores the application of ensemble modeling to the multidimensional regression problem of trajectory prediction for vehicles in urban environments. As newer and bigger state-of-the-art prediction models for autonomous driving continue to emerge, an important open challenge is the problem of how to combine the strengths of these big models without the need for costly re-training. We show how, perhaps surprisingly, combining state-of-the-art deep learning models out-of-the-box (without retraining or fine-tuning) with a simple confidence-weighted average method can enhance the overall prediction. Indeed, while combining trajectory prediction models is not straightforward, this simple approach enhances performance by 10% over the best prediction model, especially in the long-tailed metrics. We show that this performance improvement holds on both the NuScenes and Argoverse datasets, and that these improvements are made across the dataset distribution. The code for our work is open source.
ROFeb 16, 2024
RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language ModelJianhao Yuan, Shuyang Sun, Daniel Omeiza et al.
We need to trust robots that use often opaque AI methods. They need to explain themselves to us, and we need to trust their explanation. In this regard, explainability plays a critical role in trustworthy autonomous decision-making to foster transparency and acceptance among end users, especially in complex autonomous driving. Recent advancements in Multi-Modal Large Language models (MLLMs) have shown promising potential in enhancing the explainability as a driving agent by producing control predictions along with natural language explanations. However, severe data scarcity due to expensive annotation costs and significant domain gaps between different datasets makes the development of a robust and generalisable system an extremely challenging task. Moreover, the prohibitively expensive training requirements of MLLM and the unsolved problem of catastrophic forgetting further limit their generalisability post-deployment. To address these challenges, we present RAG-Driver, a novel retrieval-augmented multi-modal large language model that leverages in-context learning for high-performance, explainable, and generalisable autonomous driving. By grounding in retrieved expert demonstration, we empirically validate that RAG-Driver achieves state-of-the-art performance in producing driving action explanations, justifications, and control signal prediction. More importantly, it exhibits exceptional zero-shot generalisation capabilities to unseen environments without further training endeavours.
ROJul 15, 2024
Risk-aware Trajectory Prediction by Incorporating Spatio-temporal Traffic Interaction AnalysisDivya Thuremella, Lewis Ince, Lars Kunze
To operate in open-ended environments where humans interact in complex, diverse ways, autonomous robots must learn to predict their behaviour, especially when that behavior is potentially dangerous to other agents or to the robot. However, reducing the risk of accidents requires prior knowledge of where potential collisions may occur and how. Therefore, we propose to gain this information by analyzing locations and speeds that commonly correspond to high-risk interactions within the dataset, and use it within training to generate better predictions in high risk situations. Through these location-based and speed-based re-weighting techniques, we achieve improved overall performance, as measured by most-likely FDE and KDE, as well as improved performance on high-speed vehicles, and vehicles within high-risk locations. 2023 IEEE International Conference on Robotics and Automation (ICRA)
ROMar 13, 2024
Real-time 3D semantic occupancy prediction for autonomous vehicles using memory-efficient sparse convolutionSamuel Sze, Lars Kunze
In autonomous vehicles, understanding the surrounding 3D environment of the ego vehicle in real-time is essential. A compact way to represent scenes while encoding geometric distances and semantic object information is via 3D semantic occupancy maps. State of the art 3D mapping methods leverage transformers with cross-attention mechanisms to elevate 2D vision-centric camera features into the 3D domain. However, these methods encounter significant challenges in real-time applications due to their high computational demands during inference. This limitation is particularly problematic in autonomous vehicles, where GPU resources must be shared with other tasks such as localization and planning. In this paper, we introduce an approach that extracts features from front-view 2D camera images and LiDAR scans, then employs a sparse convolution network (Minkowski Engine), for 3D semantic occupancy prediction. Given that outdoor scenes in autonomous driving scenarios are inherently sparse, the utilization of sparse convolution is particularly apt. By jointly solving the problems of 3D scene completion of sparse scenes and 3D semantic segmentation, we provide a more efficient learning framework suitable for real-time applications in autonomous vehicles. We also demonstrate competitive accuracy on the nuScenes dataset.
CYFeb 21, 2024
Testing autonomous vehicles and AI: perspectives and challenges from cybersecurity, transparency, robustness and fairnessDavid Fernández Llorca, Ronan Hamon, Henrik Junklewitz et al.
This study explores the complexities of integrating Artificial Intelligence (AI) into Autonomous Vehicles (AVs), examining the challenges introduced by AI components and the impact on testing procedures, focusing on some of the essential requirements for trustworthy AI. Topics addressed include the role of AI at various operational layers of AVs, the implications of the EU's AI Act on AVs, and the need for new testing methodologies for Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS). The study also provides a detailed analysis on the importance of cybersecurity audits, the need for explainability in AI decision-making processes and protocols for assessing the robustness and ethical behaviour of predictive systems in AVs. The paper identifies significant challenges and suggests future directions for research and development of AI in AV technology, highlighting the need for multidisciplinary expertise.
ROOct 14, 2024
NAR-*ICP: Neural Execution of Classical ICP-based Pointcloud Registration AlgorithmsEfimia Panagiotaki, Daniele De Martini, Lars Kunze et al.
This study explores the intersection of neural networks and classical robotics algorithms through the Neural Algorithmic Reasoning (NAR) blueprint, enabling the training of neural networks to reason like classical robotics algorithms by learning to execute them. Algorithms are integral to robotics and safety-critical applications due to their predictable and consistent performance through logical and mathematical principles. In contrast, while neural networks are highly adaptable, handling complex, high-dimensional data and generalising across tasks, they often lack interpretability and transparency in their internal computations. To bridge the two, we propose a novel Graph Neural Network (GNN)-based framework, NAR-*ICP, that learns the intermediate computations of classical ICP-based registration algorithms, extending the CLRS Benchmark. We evaluate our approach across real-world and synthetic datasets, demonstrating its flexibility in handling complex inputs, and its potential to be used within larger learning pipelines. Our method achieves superior performance compared to the baselines, even surpassing the algorithms it was trained on, further demonstrating its ability to generalise beyond the capabilities of traditional algorithms.
CVApr 3, 2025
MinkOcc: Towards real-time label-efficient semantic occupancy predictionSamuel Sze, Daniele De Martini, Lars Kunze
Developing 3D semantic occupancy prediction models often relies on dense 3D annotations for supervised learning, a process that is both labor and resource-intensive, underscoring the need for label-efficient or even label-free approaches. To address this, we introduce MinkOcc, a multi-modal 3D semantic occupancy prediction framework for cameras and LiDARs that proposes a two-step semi-supervised training procedure. Here, a small dataset of explicitly 3D annotations warm-starts the training process; then, the supervision is continued by simpler-to-annotate accumulated LiDAR sweeps and images -- semantically labelled through vision foundational models. MinkOcc effectively utilizes these sensor-rich supervisory cues and reduces reliance on manual labeling by 90\% while maintaining competitive accuracy. In addition, the proposed model incorporates information from LiDAR and camera data through early fusion and leverages sparse convolution networks for real-time prediction. With its efficiency in both supervision and computation, we aim to extend MinkOcc beyond curated datasets, enabling broader real-world deployment of 3D semantic occupancy prediction in autonomous driving.
AIMar 18, 2025
Generating Causal Explanations of Vehicular Agent Behavioural Interactions with Learnt Reward ProfilesRhys Howard, Nick Hawes, Lars Kunze
Transparency and explainability are important features that responsible autonomous vehicles should possess, particularly when interacting with humans, and causal reasoning offers a strong basis to provide these qualities. However, even if one assumes agents act to maximise some concept of reward, it is difficult to make accurate causal inferences of agent planning without capturing what is of importance to the agent. Thus our work aims to learn a weighting of reward metrics for agents such that explanations for agent interactions can be causally inferred. We validate our approach quantitatively and qualitatively across three real-world driving datasets, demonstrating a functional improvement over previous methods and competitive performance across evaluation metrics.
CVOct 13, 2025
LikePhys: Evaluating Intuitive Physics Understanding in Video Diffusion Models via Likelihood PreferenceJianhao Yuan, Fabio Pizzati, Francesco Pinto et al.
Intuitive physics understanding in video diffusion models plays an essential role in building general-purpose physically plausible world simulators, yet accurately evaluating such capacity remains a challenging task due to the difficulty in disentangling physics correctness from visual appearance in generation. To the end, we introduce LikePhys, a training-free method that evaluates intuitive physics in video diffusion models by distinguishing physically valid and impossible videos using the denoising objective as an ELBO-based likelihood surrogate on a curated dataset of valid-invalid pairs. By testing on our constructed benchmark of twelve scenarios spanning over four physics domains, we show that our evaluation metric, Plausibility Preference Error (PPE), demonstrates strong alignment with human preference, outperforming state-of-the-art evaluator baselines. We then systematically benchmark intuitive physics understanding in current video diffusion models. Our study further analyses how model design and inference settings affect intuitive physics understanding and highlights domain-specific capacity variations across physical laws. Empirical results show that, despite current models struggling with complex and chaotic dynamics, there is a clear trend of improvement in physics understanding as model capacity and inference settings scale.
ROOct 17, 2024
GraphSCENE: On-Demand Critical Scenario Generation for Autonomous Vehicles in SimulationEfimia Panagiotaki, Georgi Pramatarov, Lars Kunze et al.
Testing and validating Autonomous Vehicle (AV) performance in safety-critical and diverse scenarios is crucial before real-world deployment. However, manually creating such scenarios in simulation remains a significant and time-consuming challenge. This work introduces a novel method that generates dynamic temporal scene graphs corresponding to diverse traffic scenarios, on-demand, tailored to user-defined preferences, such as AV actions, sets of dynamic agents, and criticality levels. A temporal Graph Neural Network (GNN) model learns to predict relationships between ego-vehicle, agents, and static structures, guided by real-world spatiotemporal interaction patterns and constrained by an ontology that restricts predictions to semantically valid links. Our model consistently outperforms the baselines in accurately generating links corresponding to the requested scenarios. We render the predicted scenarios in simulation to further demonstrate their effectiveness as testing environments for AV agents.
LGJun 23, 2024
Learning Run-time Safety Monitors for Machine Learning ComponentsOzan Vardal, Richard Hawkins, Colin Paterson et al.
For machine learning components used as part of autonomous systems (AS) in carrying out critical tasks it is crucial that assurance of the models can be maintained in the face of post-deployment changes (such as changes in the operating environment of the system). A critical part of this is to be able to monitor when the performance of the model at runtime (as a result of changes) poses a safety risk to the system. This is a particularly difficult challenge when ground truth is unavailable at runtime. In this paper we introduce a process for creating safety monitors for ML components through the use of degraded datasets and machine learning. The safety monitor that is created is deployed to the AS in parallel to the ML component to provide a prediction of the safety risk associated with the model output. We demonstrate the viability of our approach through some initial experiments using publicly available speed sign datasets.
AIJun 3, 2024
Extending Structural Causal Models for Autonomous Vehicles to Simplify Temporal System Construction & Enable Dynamic Interactions Between AgentsRhys Howard, Lars Kunze
In this work we aim to bridge the divide between autonomous vehicles and causal reasoning. Autonomous vehicles have come to increasingly interact with human drivers, and in many cases may pose risks to the physical or mental well-being of those they interact with. Meanwhile causal models, despite their inherent transparency and ability to offer contrastive explanations, have found limited usage within such systems. As such, we first identify the challenges that have limited the integration of structural causal models within autonomous vehicles. We then introduce a number of theoretical extensions to the structural causal model formalism in order to tackle these challenges. This augments these models to possess greater levels of modularisation and encapsulation, as well presenting temporal causal model representation with constant space complexity. We also prove through the extensions we have introduced that dynamically mutable sets (e.g. varying numbers of autonomous vehicles across time) can be used within a structural causal model while maintaining a relaxed form of causal stationarity. Finally we discuss the application of the extensions in the context of the autonomous vehicle and service robotics domain along with potential directions for future work.
ROMar 21, 2024
COBRA-PPM: A Causal Bayesian Reasoning Architecture Using Probabilistic Programming for Robot Manipulation Under UncertaintyRicardo Cannizzaro, Michael Groom, Jonathan Routley et al. · oxford
Manipulation tasks require robots to reason about cause and effect when interacting with objects. Yet, many data-driven approaches lack causal semantics and thus only consider correlations. We introduce COBRA-PPM, a novel causal Bayesian reasoning architecture that combines causal Bayesian networks and probabilistic programming to perform interventional inference for robot manipulation under uncertainty. We demonstrate its capabilities through high-fidelity Gazebo-based experiments on an exemplar block stacking task, where it predicts manipulation outcomes with high accuracy (Pred Acc: 88.6%) and performs greedy next-best action selection with a 94.2% task success rate. We further demonstrate sim2real transfer on a domestic robot, showing effectiveness in handling real-world uncertainty from sensor noise and stochastic actions. Our generalised and extensible framework supports a wide range of manipulation scenarios and lays a foundation for future work at the intersection of robotics and causality.
HCMar 19, 2024
Advancing Explainable Autonomous Vehicle Systems: A Comprehensive Review and Research RoadmapSule Tekkesinoglu, Azra Habibovic, Lars Kunze
Given the uncertainty surrounding how existing explainability methods for autonomous vehicles (AVs) meet the diverse needs of stakeholders, a thorough investigation is imperative to determine the contexts requiring explanations and suitable interaction strategies. A comprehensive review becomes crucial to assess the alignment of current approaches with the varied interests and expectations within the AV ecosystem. This study presents a review to discuss the complexities associated with explanation generation and presentation to facilitate the development of more effective and inclusive explainable AV systems. Our investigation led to categorising existing literature into three primary topics: explanatory tasks, explanatory information, and explanatory information communication. Drawing upon our insights, we have proposed a comprehensive roadmap for future research centred on (i) knowing the interlocutor, (ii) generating timely explanations, (ii) communicating human-friendly explanations, and (iv) continuous learning. Our roadmap is underpinned by principles of responsible research and innovation, emphasising the significance of diverse explanation requirements. To effectively tackle the challenges associated with implementing explainable AV systems, we have delineated various research directions, including the development of privacy-preserving data integration, ethical frameworks, real-time analytics, human-centric interaction design, and enhanced cross-disciplinary collaborations. By exploring these research directions, the study aims to guide the development and deployment of explainable AVs, informed by a holistic understanding of user needs, technological advancements, regulatory compliance, and ethical considerations, thereby ensuring safer and more trustworthy autonomous driving experiences.
HCMar 9, 2021
Explanations in Autonomous Driving: A SurveyDaniel Omeiza, Helena Webb, Marina Jirotka et al.
The automotive industry has witnessed an increasing level of development in the past decades; from manufacturing manually operated vehicles to manufacturing vehicles with a high level of automation. With the recent developments in Artificial Intelligence (AI), automotive companies now employ blackbox AI models to enable vehicles to perceive their environments and make driving decisions with little or no input from a human. With the hope to deploy autonomous vehicles (AV) on a commercial scale, the acceptance of AV by society becomes paramount and may largely depend on their degree of transparency, trustworthiness, and compliance with regulations. The assessment of the compliance of AVs to these acceptance requirements can be facilitated through the provision of explanations for AVs' behaviour. Explainability is therefore seen as an important requirement for AVs. AVs should be able to explain what they have 'seen', done, and might do in environments in which they operate. In this paper, we provide a comprehensive survey of the existing body of work around explainable autonomous driving. First, we open with a motivation for explanations by highlighting and emphasising the importance of transparency, accountability, and trust in AVs; and examining existing regulations and standards related to AVs. Second, we identify and categorise the different stakeholders involved in the development, use, and regulation of AVs and elicit their explanation requirements for AV. Third, we provide a rigorous review of previous work on explanations for the different AV operations (i.e., perception, localisation, planning, control, and system management). Finally, we identify pertinent challenges and provide recommendations, such as a conceptual framework for AV explainability. This survey aims to provide the fundamental knowledge required of researchers who are interested in explainability in AVs.
CYMay 5, 2020
Sense-Assess-eXplain (SAX): Building Trust in Autonomous Vehicles in Challenging Real-World Driving ScenariosMatthew Gadd, Daniele De Martini, Letizia Marchegiani et al.
This paper discusses ongoing work in demonstrating research in mobile autonomy in challenging driving scenarios. In our approach, we address fundamental technical issues to overcome critical barriers to assurance and regulation for large-scale deployments of autonomous systems. To this end, we present how we build robots that (1) can robustly sense and interpret their environment using traditional as well as unconventional sensors; (2) can assess their own capabilities; and (3), vitally in the purpose of assurance and trust, can provide causal explanations of their interpretations and assessments. As it is essential that robots are safe and trusted, we design, develop, and demonstrate fundamental technologies in real-world applications to overcome critical barriers which impede the current deployment of robots in economically and socially important areas. Finally, we describe ongoing work in the collection of an unusual, rare, and highly valuable dataset.
ROMar 10, 2020
LiDAR Lateral Localisation Despite Challenging Occlusion from TrafficTarlan Suleymanov, Matthew Gadd, Lars Kunze et al.
This paper presents a system for improving the robustness of LiDAR lateral localisation systems. This is made possible by including detections of road boundaries which are invisible to the sensor (due to occlusion, e.g. traffic) but can be located by our Occluded Road Boundary Inference Deep Neural Network. We show an example application in which fusion of a camera stream is used to initialise the lateral localisation. We demonstrate over four driven forays through central Oxford - totalling 40 km of driving - a gain in performance that inferring of occluded road boundaries brings.
ROJul 11, 2019
Online Inference and Detection of Curbs in Partially Occluded Scenes with Sparse LIDARTarlan Suleymanov, Lars Kunze, Paul Newman
Road boundaries, or curbs, provide autonomous vehicles with essential information when interpreting road scenes and generating behaviour plans. Although curbs convey important information, they are difficult to detect in complex urban environments (in particular in comparison to other elements of the road such as traffic signs and road markings). These difficulties arise from occlusions by other traffic participants as well as changing lighting and/or weather conditions. Moreover, road boundaries have various shapes, colours and structures while motion planning algorithms require accurate and precise metric information in real-time to generate their plans. In this paper, we present a real-time LIDAR-based approach for accurate curb detection around the vehicle (360 degree). Our approach deals with both occlusions from traffic and changing environmental conditions. To this end, we project 3D LIDAR pointcloud data into 2D bird's-eye view images (akin to Inverse Perspective Mapping). These images are then processed by trained deep networks to infer both visible and occluded road boundaries. Finally, a post-processing step filters detected curb segments and tracks them over time. Experimental results demonstrate the effectiveness of the proposed approach on real-world driving data. Hence, we believe that our LIDAR-based approach provides an efficient and effective way to detect visible and occluded curbs around the vehicles in challenging driving scenarios.
ROJul 10, 2019
Generating All the Roads to Rome: Road Layout Randomization for Improved Road Marking SegmentationTom Bruls, Horia Porav, Lars Kunze et al.
Road markings provide guidance to traffic participants and enforce safe driving behaviour, understanding their semantic meaning is therefore paramount in (automated) driving. However, producing the vast quantities of road marking labels required for training state-of-the-art deep networks is costly, time-consuming, and simply infeasible for every domain and condition. In addition, training data retrieved from virtual worlds often lack the richness and complexity of the real world and consequently cannot be used directly. In this paper, we provide an alternative approach in which new road marking training pairs are automatically generated. To this end, we apply principles of domain randomization to the road layout and synthesize new images from altered semantic labels. We demonstrate that training on these synthetic pairs improves mIoU of the segmentation of rare road marking classes during real-world deployment in complex urban environments by more than 12 percentage points, while performance for other classes is retained. This framework can easily be scaled to all domains and conditions to generate large-scale road marking datasets, while avoiding manual labelling effort.
CVDec 3, 2018
The Right (Angled) Perspective: Improving the Understanding of Road Scenes Using Boosted Inverse Perspective MappingTom Bruls, Horia Porav, Lars Kunze et al.
Many tasks performed by autonomous vehicles such as road marking detection, object tracking, and path planning are simpler in bird's-eye view. Hence, Inverse Perspective Mapping (IPM) is often applied to remove the perspective effect from a vehicle's front-facing camera and to remap its images into a 2D domain, resulting in a top-down view. Unfortunately, however, this leads to unnatural blurring and stretching of objects at further distance, due to the resolution of the camera, limiting applicability. In this paper, we present an adversarial learning approach for generating a significantly improved IPM from a single camera image in real time. The generated bird's-eye-view images contain sharper features (e.g. road markings) and a more homogeneous illumination, while (dynamic) objects are automatically removed from the scene, thus revealing the underlying road layout in an improved fashion. We demonstrate our framework using real-world data from the Oxford RobotCar Dataset and show that scene understanding tasks directly benefit from our boosted IPM approach.
ROJul 13, 2018
Artificial Intelligence for Long-Term Robot Autonomy: A SurveyLars Kunze, Nick Hawes, Tom Duckett et al.
Autonomous systems will play an essential role in many applications across diverse domains including space, marine, air, field, road, and service robotics. They will assist us in our daily routines and perform dangerous, dirty and dull tasks. However, enabling robotic systems to perform autonomously in complex, real-world scenarios over extended time periods (i.e. weeks, months, or years) poses many challenges. Some of these have been investigated by sub-disciplines of Artificial Intelligence (AI) including navigation & mapping, perception, knowledge representation & reasoning, planning, interaction, and learning. The different sub-disciplines have developed techniques that, when re-integrated within an autonomous system, can enable robots to operate effectively in complex, long-term scenarios. In this paper, we survey and discuss AI techniques as 'enablers' for long-term robot autonomy, current progress in integrating these techniques within long-running robotic systems, and the future challenges and opportunities for AI in long-term autonomy.
ROApr 15, 2016
The STRANDS Project: Long-Term Autonomy in Everyday EnvironmentsNick Hawes, Chris Burbridge, Ferdian Jovan et al.
Thanks to the efforts of the robotics and autonomous systems community, robots are becoming ever more capable. There is also an increasing demand from end-users for autonomous service robots that can operate in real environments for extended periods. In the STRANDS project we are tackling this demand head-on by integrating state-of-the-art artificial intelligence and robotics research into mobile service robots, and deploying these systems for long-term installations in security and care environments. Over four deployments, our robots have been operational for a combined duration of 104 days autonomously performing end-user defined tasks, covering 116km in the process. In this article we describe the approach we have used to enable long-term autonomous operation in everyday environments, and how our robots are able to use their long run times to improve their own performance.