Fabio Arnez

CV
h-index: 15
6 papers
50 citations
Novelty: 37%
AI Score: 38

6 Papers

RO Jan 12, 2023
Towards Dependable Autonomous Systems Based on Bayesian Deep Learning Components

Fabio Arnez, Huascar Espinoza, Ansgar Radermacher et al.

As autonomous systems increasingly rely on Deep Neural Networks (DNN) to implement the navigation pipeline functions, uncertainty estimation methods have become paramount for estimating confidence in DNN predictions. Bayesian Deep Learning (BDL) offers a principled approach to model uncertainties in DNNs. However, in DNN-based systems, not all components use uncertainty estimation methods, and the systems typically ignore uncertainty propagation between components. This paper provides a method that considers the uncertainty and the interaction between BDL components to capture the overall system uncertainty. We study the effect of uncertainty propagation in a BDL-based system for autonomous aerial navigation. Experiments show that our approach allows us to capture useful uncertainty estimates while slightly improving the system's performance in its final task. In addition, we discuss the benefits, challenges, and implications of adopting BDL to build dependable autonomous systems.

LG Nov 13, 2025
The Map of Misbelief: Tracing Intrinsic and Extrinsic Hallucinations Through Attention Patterns

Elyes Hajji, Aymen Bouguerra, Fabio Arnez

Large Language Models (LLMs) are increasingly deployed in safety-critical domains, yet remain susceptible to hallucinations. While prior works have proposed confidence representation methods for hallucination detection, most of these approaches rely on computationally expensive sampling strategies and often disregard the distinction between hallucination types. In this work, we introduce a principled evaluation framework that differentiates between extrinsic and intrinsic hallucination categories and evaluates detection performance across a suite of curated benchmarks. In addition, we leverage a recent attention-based uncertainty quantification algorithm and propose novel attention aggregation strategies that improve both interpretability and hallucination detection performance. Our experimental findings reveal that sampling-based methods like Semantic Entropy are effective for detecting extrinsic hallucinations but generally fail on intrinsic ones. In contrast, our method, which aggregates attention over input tokens, is better suited for intrinsic hallucinations. These insights provide new directions for aligning detection strategies with the nature of hallucination and highlight attention as a rich signal for quantifying model uncertainty.
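
For readers unfamiliar with the sampling-based baseline named above, the following is a minimal sketch of the discrete form of Semantic Entropy. It assumes the expensive part has already been done: several answers were sampled from the model and grouped into meaning clusters (usually by an entailment model). The function name and the cluster labels are illustrative assumptions, not taken from the paper.

```python
import math
from collections import Counter

def semantic_entropy(cluster_ids):
    """Entropy (in nats) of the empirical distribution over meaning clusters.

    cluster_ids holds one label per sampled answer; answers that mean the
    same thing share a label. The clustering itself is assumed to be done
    upstream and is not shown here.
    """
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# All samples agree in meaning: entropy is zero (low uncertainty).
consistent = semantic_entropy(["paris", "paris", "paris", "paris"])

# Samples disagree: entropy is higher, which would flag a possible hallucination.
mixed = semantic_entropy(["paris", "lyon", "rome", "paris"])
```

The need to draw and cluster many samples per query is what makes this family of methods computationally expensive compared with signals read from a single forward pass, such as attention.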

CV Sep 25, 2025
Can Less Precise Be More Reliable? A Systematic Evaluation of Quantization's Impact on CLIP Beyond Accuracy

Aymen Bouguerra, Daniel Montoya, Alexandra Gomez-Villa et al.

The powerful zero-shot generalization capabilities of vision-language models (VLMs) like CLIP have enabled new paradigms for safety-related tasks such as out-of-distribution (OOD) detection. However, additional aspects crucial for the computationally efficient and reliable deployment of CLIP are still overlooked. In particular, the impact of quantization on CLIP's performance beyond accuracy remains underexplored. This work presents a large-scale evaluation of quantization on CLIP models, assessing not only in-distribution accuracy but a comprehensive suite of reliability metrics and revealing counterintuitive results driven by pre-training source. We demonstrate that quantization consistently improves calibration for typically underconfident pre-trained models, while often degrading it for overconfident variants. Intriguingly, this degradation in calibration does not preclude gains in other reliability metrics; we find that OOD detection can still improve for these same poorly calibrated models. Furthermore, we identify specific quantization-aware training (QAT) methods that yield simultaneous gains in zero-shot accuracy, calibration, and OOD robustness, challenging the view of a strict efficiency-performance trade-off. These findings offer critical insights for navigating the multi-objective problem of deploying efficient, reliable, and robust VLMs by utilizing quantization beyond its conventional role.
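
Calibration is often summarized with the Expected Calibration Error (ECE). The abstract does not state which calibration metric the paper uses, so the sketch below is only an illustration of the general idea, with hypothetical function and variable names: predictions are binned by confidence, and the gap between accuracy and mean confidence is averaged across bins.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins.

    confidences: predicted top-class probabilities in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece

# Overconfident toy model: claims 0.9 confidence but is right 3 times out of 4.
ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

In this toy case the mean confidence exceeds the accuracy, which is the overconfident regime; an underconfident model would show the opposite sign before the absolute value is taken.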

CV Jun 16, 2025
FindMeIfYouCan: Bringing Open Set metrics to $\textit{near}$, $\textit{far}$ and $\textit{farther}$ Out-of-Distribution Object Detection

Daniel Montoya, Aymen Bouguerra, Alexandra Gomez-Villa et al.

State-of-the-art Object Detection (OD) methods predominantly operate under a closed-world assumption, where test-time categories match those encountered during training. However, detecting and localizing unknown objects is crucial for safety-critical applications in domains such as autonomous driving and medical imaging. Recently, Out-Of-Distribution (OOD) detection has emerged as a vital research direction for OD, focusing on identifying incorrect predictions typically associated with unknown objects. This paper shows that the current evaluation protocol for OOD-OD violates the assumption of non-overlapping objects with respect to the In-Distribution (ID) datasets, and obscures crucial situations such as ignoring unknown objects, potentially leading to overconfidence in deployment scenarios where truly novel objects might be encountered. To address these limitations, we manually curate and enrich the existing benchmark by exploiting semantic similarity to create new evaluation splits categorized as $\textit{near}$, $\textit{far}$, and $\textit{farther}$ from ID distributions. Additionally, we incorporate established metrics from the Open Set community, providing deeper insights into how effectively methods detect unknowns, when they ignore them, and when they mistakenly classify OOD objects as ID. Our comprehensive evaluation demonstrates that semantically and visually close OOD objects are easier to localize than far ones, but are also more easily confounded with ID objects. $\textit{Far}$ and $\textit{farther}$ objects are harder to localize but less prone to be taken for an ID object.

RO Oct 26, 2021
Improving Robustness of Deep Neural Networks for Aerial Navigation by Incorporating Input Uncertainty

Fabio Arnez, Huascar Espinoza, Ansgar Radermacher et al.

Uncertainty quantification methods are required in autonomous systems that include deep learning (DL) components to assess the confidence of their estimations. However, to successfully deploy DL components in safety-critical autonomous systems, they should also handle uncertainty at the input rather than only at the output of the DL components. Considering a probability distribution in the input enables the propagation of uncertainty through different components to provide a representative measure of the overall system uncertainty. In this position paper, we propose a method to account for uncertainty at the input of Bayesian Deep Learning control policies for Aerial Navigation. Our early experiments show that the proposed method improves the robustness of the navigation policy in Out-of-Distribution (OoD) scenarios.
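
The idea of placing a probability distribution on the input and propagating it through a component can be illustrated with plain Monte Carlo sampling. This is a generic sketch, not the method proposed in the paper: the Gaussian input model, the function names, and the toy linear "policy" are all assumptions chosen for illustration.

```python
import random
import statistics

def propagate_input_uncertainty(model, mean, std, n_samples=1000, seed=0):
    """Sample inputs from N(mean, std^2), run the model on each sample,
    and return the mean and standard deviation of the outputs."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    outputs = [model(rng.gauss(mean, std)) for _ in range(n_samples)]
    return statistics.fmean(outputs), statistics.pstdev(outputs)

def toy_policy(x):
    """Hypothetical stand-in for a downstream component (e.g. a control policy)."""
    return 2.0 * x + 1.0

# An uncertain input (std 0.1) yields an uncertain output (std close to 0.2).
out_mean, out_std = propagate_input_uncertainty(toy_policy, mean=0.5, std=0.1)
```

For a linear component the output spread is simply the input spread scaled by the slope; for a neural network the same sampling loop applies, but the output distribution generally has no closed form.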

LG Jun 26, 2020
A Comparison of Uncertainty Estimation Approaches in Deep Learning Components for Autonomous Vehicle Applications

Fabio Arnez, Huascar Espinoza, Ansgar Radermacher et al.

A key factor for ensuring safety in Autonomous Vehicles (AVs) is to avoid any abnormal behaviors under undesirable and unpredicted circumstances. As AVs increasingly rely on Deep Neural Networks (DNNs) to perform safety-critical tasks, different methods for uncertainty quantification have recently been proposed to measure the inevitable source of errors in data and models. However, uncertainty quantification in DNNs is still a challenging task. These methods require a higher computational load, a higher memory footprint, and introduce extra latency, which can be prohibitive in safety-critical applications. In this paper, we provide a brief and comparative survey of methods for uncertainty quantification in DNNs along with existing metrics to evaluate uncertainty predictions. We are particularly interested in understanding the advantages and downsides of each method for specific AV tasks and types of uncertainty sources.