20.0LGApr 24
On the Properties of Feature Attribution for Supervised Contrastive LearningLeonardo Arrighi, Julia Eva Belloni, Aurélie Gallet et al.
Most Neural Networks (NNs) for classification are trained using Cross-Entropy as a loss function. This approach requires the model to have an explicit classification layer. However, there exist alternative approaches, such as Contrastive Learning (CL). Instead of explicitly operating a classification, CL has the NN produce an embedding space where projections of similar data are pulled together, while projections of dissimilar data are pushed apart. In the case of Supervised CL (SCL), labels are adopted as similarity criteria, thus creating an embedding space where the projected data points are well-clustered. SCL provides crucial advantages over CE with regard to adversarial robustness and out-of-distribution detection, thus making it a more natural choice in safety-critical scenarios. In the present paper, we empirically show that NNs for image classification trained with SCL present higher-quality feature attribution explanations than CL with regard to faithfulness, complexity, and continuity. These results reinforce previous findings about CL-based approaches when targeting more trustworthy and transparent NNs and can guide practitioners in the selection of training objectives targeting not only accuracy, but also transparency of the models.
30.3LGMar 25
No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty AttributionsEmily Schiller, Teodor Chiaburu, Marco Zullich et al.
Research on explainable AI (XAI) has frequently focused on explaining model predictions. More recently, methods have been proposed to explain prediction uncertainty by attributing it to input features (uncertainty attributions). However, the evaluation of these methods remains inconsistent as studies rely on heterogeneous proxy tasks and metrics, hindering comparability. We address this by aligning uncertainty attributions with the well-established Co-12 framework for XAI evaluation. We propose concrete implementations for the correctness, consistency, continuity, and compactness properties. Additionally, we introduce conveyance, a property tailored to uncertainty attributions that evaluates whether controlled increases in epistemic uncertainty reliably propagate to feature-level attributions. We demonstrate our evaluation framework with eight metrics across combinations of uncertainty quantification and feature attribution methods on tabular and image data. Our experiments show that gradient-based methods consistently outperform perturbation-based approaches in consistency and conveyance, while Monte-Carlo dropconnect outperforms Monte-Carlo dropout in most metrics. Although most metrics rank the methods consistently across samples, inter-method agreement remains low. This suggests no single metric sufficiently evaluates uncertainty attribution quality. The proposed evaluation framework contributes to the body of knowledge by establishing a foundation for systematic comparison and development of uncertainty attribution methods.
IVDec 19, 2024
Uncertainty Estimation for Super-Resolution using ESRGANManiraj Sai Adapa, Marco Zullich, Matias Valdenegro-Toro
Deep Learning-based image super-resolution (SR) has been gaining traction with the aid of Generative Adversarial Networks. Models like SRGAN and ESRGAN are constantly ranked between the best image SR tools. However, they lack principled ways for estimating predictive uncertainty. In the present work, we enhance these models using Monte Carlo-Dropout and Deep Ensemble, allowing the computation of predictive uncertainty. When coupled with a prediction, uncertainty estimates can provide more information to the model users, highlighting pixels where the SR output might be uncertain, hence potentially inaccurate, if these estimates were to be reliable. Our findings suggest that these uncertainty estimates are decently calibrated and can hence fulfill this goal, while providing no performance drop with respect to the corresponding models without uncertainty estimation.
LGAug 9, 2025
Sparsity-Driven Plasticity in Multi-Task Reinforcement LearningAleksandar Todorov, Juan Cardenas-Cartagena, Rafael F. Cunha et al.
Plasticity loss, a diminishing capacity to adapt as training progresses, is a critical challenge in deep reinforcement learning. We examine this issue in multi-task reinforcement learning (MTRL), where higher representational flexibility is crucial for managing diverse and potentially conflicting task demands. We systematically explore how sparsification methods, particularly Gradual Magnitude Pruning (GMP) and Sparse Evolutionary Training (SET), enhance plasticity and consequently improve performance in MTRL agents. We evaluate these approaches across distinct MTRL architectures (shared backbone, Mixture of Experts, Mixture of Orthogonal Experts) on standardized MTRL benchmarks, comparing against dense baselines, and a comprehensive range of alternative plasticity-inducing or regularization methods. Our results demonstrate that both GMP and SET effectively mitigate key indicators of plasticity degradation, such as neuron dormancy and representational collapse. These plasticity improvements often correlate with enhanced multi-task performance, with sparse agents frequently outperforming dense counterparts and achieving competitive results against explicit plasticity interventions. Our findings offer insights into the interplay between plasticity, network sparsity, and MTRL designs, highlighting dynamic sparsification as a robust but context-sensitive tool for developing more adaptable MTRL systems.
AIApr 12, 2025
Explainable Artificial Intelligence techniques for interpretation of food datasets: a reviewLeonardo Arrighi, Ingrid Alves de Moraes, Marco Zullich et al.
Artificial Intelligence (AI) has become essential for analyzing complex data and solving highly-challenging tasks. It is being applied across numerous disciplines beyond computer science, including Food Engineering, where there is a growing demand for accurate and trustworthy predictions to meet stringent food quality standards. However, this requires increasingly complex AI models, raising reliability concerns. In response, eXplainable AI (XAI) has emerged to provide insights into AI decision-making, aiding model interpretation by developers and users. Nevertheless, XAI remains underutilized in Food Engineering, limiting model reliability. For instance, in food quality control, AI models using spectral imaging can detect contaminants or assess freshness levels, but their opaque decision-making process hinders adoption. XAI techniques such as SHAP (Shapley Additive Explanations) and Grad-CAM (Gradient-weighted Class Activation Mapping) can pinpoint which spectral wavelengths or image regions contribute most to a prediction, enhancing transparency and aiding quality control inspectors in verifying AI-generated assessments. This survey presents a taxonomy for classifying food quality research using XAI techniques, organized by data types and explanation methods, to guide researchers in choosing suitable approaches. We also highlight trends, challenges, and opportunities to encourage the adoption of XAI in Food Engineering.
CVJan 21, 2025
Large-image Object Detection for Fine-grained Recognition of Punches Patterns in Medieval Panel PaintingJosh Bruegger, Diana Ioana Catana, Vanja Macovaz et al.
The attribution of the author of an art piece is typically a laborious manual process, usually relying on subjective evaluations of expert figures. However, there are some situations in which quantitative features of the artwork can support these evaluations. The extraction of these features can sometimes be automated, for instance, with the use of Machine Learning (ML) techniques. An example of these features is represented by repeated, mechanically impressed patterns, called punches, present chiefly in 13th and 14th-century panel paintings from Tuscany. Previous research in art history showcased a strong connection between the shapes of punches and specific artists or workshops, suggesting the possibility of using these quantitative cues to support the attribution. In the present work, we first collect a dataset of large-scale images of these panel paintings. Then, using YOLOv10, a recent and popular object detection model, we train a ML pipeline to perform object detection on the punches contained in the images. Due to the large size of the images, the detection procedure is split across multiple frames by adopting a sliding-window approach with overlaps, after which the predictions are combined for the whole image using a custom non-maximal suppression routine. Our results indicate how art historians working in the field can reliably use our method for the identification and extraction of punches.
CVDec 17, 2025
Enhancing Tree Species Classification: Insights from YOLOv8 and Explainable AI Applied to TLS Point Cloud ProjectionsAdrian Straker, Paul Magdon, Marco Zullich et al.
Classifying tree species has been a core research area in forest remote sensing for decades. New sensors and classification approaches like TLS and deep learning achieve state-of-the art accuracy but their decision processes remain unclear. Methods such as Finer-CAM (Class Activation Mapping) can highlight features in TLS projections that contribute to the classification of a target species, yet are uncommon in similar looking contrastive tree species. We propose a novel method linking Finer-CAM explanations to segments of TLS projections representing structural tree features to systemically evaluate which features drive species discrimination. Using TLS data from 2,445 trees across seven European tree species, we trained and validated five YOLOv8 models with cross-validation, reaching a mean accuracy of 96% (SD = 0.24%). Analysis of 630 saliency maps shows the models primarily rely on crown features in TLS projections for species classification. While this result is pronounced in Silver Birch, European Beech, English oak, and Norway spruce, stem features contribute more frequently to the differentiation of European ash, Scots pine, and Douglas fir. Particularly representations of finer branches contribute to the decisions of the models. The models consider those tree species similar to each other which a human expert would also regard as similar. Furthermore, our results highlight the need for an improved understanding of the decision processes of tree species classification models to help reveal data set and model limitations, biases, and to build confidence in model predictions.
CLSep 23, 2025
Uncertainty in Semantic Language Modeling with PIXELSStefania Radu, Marco Zullich, Matias Valdenegro-Toro
Pixel-based language models aim to solve the vocabulary bottleneck problem in language modeling, but the challenge of uncertainty quantification remains open. The novelty of this work consists of analysing uncertainty and confidence in pixel-based language models across 18 languages and 7 scripts, all part of 3 semantically challenging tasks. This is achieved through several methods such as Monte Carlo Dropout, Transformer Attention, and Ensemble Learning. The results suggest that pixel-based models underestimate uncertainty when reconstructing patches. The uncertainty is also influenced by the script, with Latin languages displaying lower uncertainty. The findings on ensemble learning show better performance when applying hyperparameter tuning during the named entity recognition and question-answering tasks across 16 languages.
LGJan 14, 2025
Can Bayesian Neural Networks Explicitly Model Input Uncertainty?Matias Valdenegro-Toro, Marco Zullich
Inputs to machine learning models can have associated noise or uncertainties, but they are often ignored and not modelled. It is unknown if Bayesian Neural Networks and their approximations are able to consider uncertainty in their inputs. In this paper we build a two input Bayesian Neural Network (mean and standard deviation) and evaluate its capabilities for input uncertainty estimation across different methods like Ensembles, MC-Dropout, and Flipout. Our results indicate that only some uncertainty estimation methods for approximate Bayesian NNs can model input uncertainty, in particular Ensembles and Flipout.
CVDec 19, 2024
Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot LearningEric Brouwer, Jan Erik van Woerden, Gertjan Burghouts et al.
Few-shot, fine-grained classification in computer vision poses significant challenges due to the need to differentiate subtle class distinctions with limited data. This paper presents a novel method that enhances the Contrastive Language-Image Pre-Training (CLIP) model through adaptive prompt tuning, guided by real-time visual inputs. Unlike existing techniques such as Context Optimization (CoOp) and Visual Prompt Tuning (VPT), which are constrained by static prompts or visual token reliance, the proposed approach leverages a cross-attention mechanism to dynamically refine text prompts for the image at hand. This enables an image-specific alignment of textual features with image patches extracted from the Vision Transformer, making the model more effective for datasets with high intra-class variance and low inter-class differences. The method is evaluated on several datasets, including CUBirds, Oxford Flowers, and FGVC Aircraft, showing significant performance gains over static prompt tuning approaches. To ensure these performance gains translate into trustworthy predictions, we integrate Monte-Carlo Dropout in our approach to improve the reliability of the model predictions and uncertainty estimates. This integration provides valuable insights into the model's predictive confidence, helping to identify when predictions can be trusted and when additional verification is necessary. This dynamic approach offers a robust solution, advancing the state-of-the-art for few-shot fine-grained classification.
LGNov 18, 2024
Upside-Down Reinforcement Learning for More Interpretable Optimal ControlJuan Cardenas-Cartagena, Massimiliano Falzari, Marco Zullich et al.
Model-Free Reinforcement Learning (RL) algorithms either learn how to map states to expected rewards or search for policies that can maximize a certain performance function. Model-Based algorithms instead, aim to learn an approximation of the underlying model of the RL environment and then use it in combination with planning algorithms. Upside-Down Reinforcement Learning (UDRL) is a novel learning paradigm that aims to learn how to predict actions from states and desired commands. This task is formulated as a Supervised Learning problem and has successfully been tackled by Neural Networks (NNs). In this paper, we investigate whether function approximation algorithms other than NNs can also be used within a UDRL framework. Our experiments, performed over several popular optimal control benchmarks, show that tree-based methods like Random Forests and Extremely Randomized Trees can perform just as well as NNs with the significant benefit of resulting in policies that are inherently more interpretable than NNs, therefore paving the way for more transparent, safe, and robust RL.
LGJun 26, 2024
Unified Uncertainties: Combining Input, Data and Model Uncertainty into a Single FormulationMatias Valdenegro-Toro, Ivo Pascal de Jong, Marco Zullich
Modelling uncertainty in Machine Learning models is essential for achieving safe and reliable predictions. Most research on uncertainty focuses on output uncertainty (predictions), but minimal attention is paid to uncertainty at inputs. We propose a method for propagating uncertainty in the inputs through a Neural Network that is simultaneously able to estimate input, data, and model uncertainty. Our results show that this propagation of input uncertainty results in a more stable decision boundary even under large amounts of input noise than comparatively simple Monte Carlo sampling. Additionally, we discuss and demonstrate that input uncertainty, when propagated through the model, results in model uncertainty at the outputs. The explicit incorporation of input uncertainty may be beneficial in situations where the amount of input uncertainty is known, though good datasets for this are still needed.