Ivo Pascal de Jong

LG
h-index: 26
11 papers
33 citations
Novelty: 35%
AI Score: 46

11 Papers

SP · Nov 15, 2023
Uncertainty Quantification in Machine Learning for Biosignal Applications -- A Review

Ivo Pascal de Jong, Andreea Ioana Sburlea, Matias Valdenegro-Toro

Uncertainty Quantification (UQ) has gained traction in an attempt to improve the interpretability and robustness of machine learning predictions. Specifically, (medical) biosignals such as electroencephalography (EEG), electrocardiography (ECG), electrooculography (EOG), and electromyography (EMG) could benefit from good UQ, since these suffer from a poor signal-to-noise ratio, and good human interpretability is pivotal for medical applications. In this paper, we review the state of the art of applying Uncertainty Quantification to Machine Learning tasks in the biosignal domain. We present various methods, shortcomings, uncertainty measures and theoretical frameworks that currently exist in this application domain. We address misconceptions in the field, provide recommendations for future work, and discuss gaps in the literature in relation to diagnostic implementations as well as control for prostheses or brain-computer interfaces. Overall, it can be concluded that promising UQ methods are available, but that research is needed on how people and systems may interact with an uncertainty model in a (clinical) environment.

LG · Aug 22, 2024
Measuring Orthogonality as the Blind-Spot of Uncertainty Disentanglement

Ivo Pascal de Jong, Andreea Ioana Sburlea, Matthia Sabatelli et al.

Aleatoric (data) and epistemic (knowledge) uncertainty are textbook components of Uncertainty Quantification. Jointly estimating these components has been shown to be problematic and non-trivial. As a result, there are multiple ways to disentangle these uncertainties, but current methods to evaluate them are insufficient. We propose that aleatoric and epistemic uncertainty estimates should be orthogonally disentangled - meaning that each uncertainty is not affected by the other - a necessary condition that is often not met. We prove that orthogonality and consistency are necessary and sufficient criteria for disentanglement, and construct Uncertainty Disentanglement Error (UDE) as a metric to measure these criteria. Further empirical evaluation shows that finetuned models give different orthogonality results than models trained from scratch and that UDE can be optimized for through the dropout rate. We demonstrate that a Deep Ensemble trained from scratch on ImageNet-1k with Information Theoretic disentangling achieves consistent and orthogonal estimates of epistemic uncertainty, but estimates of aleatoric uncertainty still fail on orthogonality.
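For readers unfamiliar with the term, information-theoretic disentangling is commonly written as a split of the entropy of the averaged ensemble prediction (total uncertainty) into the average per-member entropy (aleatoric) and the remainder, a mutual information term (epistemic). The sketch below illustrates that common formulation only; the function names and the list-of-lists input format are assumptions made for illustration and are not taken from the paper, and it does not implement the paper's UDE metric.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def disentangle(member_probs):
    """Split predictive uncertainty for a single input.

    member_probs: one class-probability list per ensemble member
    (hypothetical input format chosen for this sketch).
    Returns (total, aleatoric, epistemic) in nats.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    # Average the members' predictions class by class.
    mean_probs = [
        sum(member[c] for member in member_probs) / n_members
        for c in range(n_classes)
    ]
    total = entropy(mean_probs)                    # entropy of the mean
    aleatoric = sum(entropy(m) for m in member_probs) / n_members  # mean entropy
    epistemic = total - aleatoric                  # mutual information
    return total, aleatoric, epistemic


# Members agree on an ambiguous prediction: all uncertainty is aleatoric.
print(disentangle([[0.5, 0.5], [0.5, 0.5]]))
# Members are each certain but disagree: all uncertainty is epistemic.
print(disentangle([[1.0, 0.0], [0.0, 1.0]]))
```

In both toy cases the total uncertainty is the same, which is why the split between the two components, rather than the total, is what needs evaluating.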

LG · Mar 4
The Challenge of Out-Of-Distribution Detection in Motor Imagery BCIs

Merlijn Quincent Mulder, Matias Valdenegro-Toro, Andreea Ioana Sburlea et al.

Machine Learning classifiers used in Brain-Computer Interfaces make classifications based on the distribution of data they were trained on. When they need to make inferences on samples that fall outside of this distribution, they can only make blind guesses. Instead of allowing random guesses, these Out-of-Distribution (OOD) samples should be detected and rejected. We study OOD detection in Motor Imagery BCIs by training a model on some classes and observing whether unfamiliar classes can be detected based on increased uncertainty. We test seven different OOD detection techniques and one more method that has been claimed to boost the quality of OOD detection. Our findings show that OOD detection for Brain-Computer Interfaces is more challenging than in other machine learning domains due to the high uncertainty inherent in classifying EEG signals. For many subjects, uncertainty for in-distribution classes can still be higher than for out-of-distribution classes. As a result, many OOD detection methods prove to be ineffective, though MC Dropout performed best. Additionally, we show that high in-distribution classification performance predicts high OOD detection performance, suggesting that improved accuracy can also lead to improved robustness. Our research demonstrates a setup for studying how models deal with unfamiliar EEG data and evaluates methods that are robust to these unfamiliar inputs. OOD detection can improve the overall safety and reliability of BCIs.
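One way to make the observation about overlapping uncertainties concrete is to score how well an uncertainty value separates in-distribution from OOD samples. The sketch below uses AUROC for this, which is an assumption here: the abstract does not state which evaluation metric the authors used, and the function name and example scores are hypothetical.

```python
def ood_auroc(in_dist_uncertainty, ood_uncertainty):
    """AUROC for detecting OOD samples from uncertainty scores.

    Equals the probability that a randomly chosen OOD sample has
    higher uncertainty than a randomly chosen in-distribution
    sample, counting ties as half.
    """
    wins = 0.0
    for ood_score in ood_uncertainty:
        for in_score in in_dist_uncertainty:
            if ood_score > in_score:
                wins += 1.0
            elif ood_score == in_score:
                wins += 0.5
    return wins / (len(in_dist_uncertainty) * len(ood_uncertainty))


# Clean separation: every OOD sample is more uncertain.
print(ood_auroc([0.1, 0.2], [0.8, 0.9]))
# Overlap: some in-distribution samples are more uncertain than OOD ones.
print(ood_auroc([0.5, 0.9], [0.6, 0.7]))
```

A value near 0.5 means the uncertainty score carries no usable OOD signal, which is the situation the abstract describes for many subjects.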

LG · Mar 11
Riemannian Geometry-Preserving Variational Autoencoder for MI-BCI Data Augmentation

Viktorija Poļaka, Ivo Pascal de Jong, Andreea Ioana Sburlea

This paper addresses the challenge of generating synthetic electroencephalogram (EEG) covariance matrices for motor imagery brain-computer interface (MI-BCI) applications. Objective: We aim to develop a generative model capable of producing high-fidelity synthetic covariance matrices while preserving their symmetric positive-definite nature. Approach: We propose a Riemannian geometry-preserving variational autoencoder (RGP-VAE) integrating geometric mappings with a composite loss function combining Riemannian distance, tangent space reconstruction accuracy and generative diversity. Results: The model generates valid, representative EEG covariance matrices, while learning a subject-invariant latent space. Synthetic data proves practically useful for MI-BCI, with its impact depending on the paired classifier. Contribution: This work introduces and validates the RGP-VAE as a geometry-preserving generative model for EEG covariance matrices, highlighting its potential for signal privacy, scalability and data augmentation.

CV · Apr 4, 2025
Know What You do Not Know: Verbalized Uncertainty Estimation Robustness on Corrupted Images in Vision-Language Models

Mirko Borszukovszki, Ivo Pascal de Jong, Matias Valdenegro-Toro

To leverage the full potential of Large Language Models (LLMs), it is crucial to have some information on the uncertainty of their answers. This means that the model has to be able to quantify how certain it is of the correctness of a given response. Bad uncertainty estimates can lead to overconfident wrong answers, undermining trust in these models. Considerable research has been done on language models that work with text inputs and provide text outputs. Still, since visual capabilities were only recently added to these models, there has not been much progress on the uncertainty of Visual Language Models (VLMs). We tested three state-of-the-art VLMs on corrupted image data. We found that the severity of the corruption negatively impacted the models' ability to estimate their uncertainty, and that the models also showed overconfidence in most of the experiments.

IR · Mar 9
Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval

Matei Benescu, Ivo Pascal de Jong

With the emergence of Large Language Models (LLMs), new methods in Information Retrieval are available in which relevance is estimated directly through language understanding and reasoning, instead of embedding similarity. We argue that similarity is a short-sighted interpretation of relevance, and that LLM-Based Relevance Judgment Systems (LLM-RJS) (with reasoning) have the potential to outperform Neural Embedding Retrieval Systems (NERS) by overcoming this limitation. Using the TREC-DL 2019 passage retrieval dataset, we compare various LLM-RJS with NERS, but observe no noticeable improvement. Subsequently, we analyze the impact of reasoning by comparing LLM-RJS with and without reasoning. We find that human annotations also suffer from short-sightedness, and that false positives in the reasoning LLM-RJS are primarily mistakes in annotations due to short-sightedness. We conclude that LLM-RJS do have the ability to address the short-sightedness limitation in NERS, but that this cannot be evaluated with standard annotated relevance datasets.

CL · Aug 5, 2025
NLP Methods May Actually Be Better Than Professors at Estimating Question Difficulty

Leonidas Zotos, Ivo Pascal de Jong, Matias Valdenegro-Toro et al.

Estimating the difficulty of exam questions is essential for developing good exams, but professors are not always good at this task. We compare various Large Language Model-based methods with three professors in their ability to estimate what percentage of students will give correct answers on True/False exam questions in the areas of Neural Networks and Machine Learning. Our results show that the professors have limited ability to distinguish between easy and difficult questions and that they are outperformed by directly asking Gemini 2.5 to solve this task. Yet, we obtained even better results using uncertainties of the LLMs solving the questions in a supervised learning setting, using only 42 training samples. We conclude that supervised learning using LLM uncertainty can help professors better estimate the difficulty of exam questions, improving the quality of assessment.

LG · Jul 10, 2025
Uncertainty Quantification for Motor Imagery BCI -- Machine Learning vs. Deep Learning

Joris Suurmeijer, Ivo Pascal de Jong, Matias Valdenegro-Toro et al.

Brain-computer interfaces (BCIs) turn brain signals into functionally useful output, but they are not always accurate. A good Machine Learning classifier should be able to indicate how confident it is about a given classification, by giving a probability for its classification. Standard classifiers for Motor Imagery BCIs do give such probabilities, but research on uncertainty quantification has been limited to Deep Learning. We compare the uncertainty quantification ability of established BCI classifiers using Common Spatial Patterns (CSP-LDA) and Riemannian Geometry (MDRM) to specialized methods in Deep Learning (Deep Ensembles and Direct Uncertainty Quantification) as well as standard Convolutional Neural Networks (CNNs). We found that the overconfidence typically seen in Deep Learning is not a problem in CSP-LDA and MDRM. We found that MDRM is underconfident, which we solved by adding Temperature Scaling (MDRM-T). CSP-LDA and MDRM-T give the best uncertainty estimates, but Deep Ensembles and standard CNNs give the best classifications. We show that all models are able to separate between easy and difficult estimates, so that we can increase the accuracy of a Motor Imagery BCI by rejecting samples that are ambiguous.
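The two mechanisms mentioned in this abstract, Temperature Scaling and rejecting ambiguous samples, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the fixed temperature, and the 0.7 threshold are hypothetical, and in practice the temperature is typically fitted on held-out data rather than set by hand.

```python
import math


def softmax_with_temperature(scores, temperature=1.0):
    """Softmax over class scores divided by a temperature.

    temperature < 1 sharpens the distribution (counteracts
    underconfidence); temperature > 1 softens it.
    """
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def predict_or_reject(scores, temperature=1.0, threshold=0.7):
    """Return (class_index, confidence), or (None, confidence) if rejected."""
    probs = softmax_with_temperature(scores, temperature)
    confidence = max(probs)
    if confidence < threshold:
        return None, confidence  # too ambiguous: abstain
    return probs.index(confidence), confidence


# Hypothetical class scores from an underconfident classifier.
scores = [0.2, 0.0]
print(predict_or_reject(scores, temperature=1.0))  # rejected at this threshold
print(predict_or_reject(scores, temperature=0.1))  # accepted after sharpening
```

Scaling changes only the confidence, never the predicted class, so it affects which samples are rejected without altering classification accuracy on the accepted ones' labels.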

LG · Jun 26, 2024
Unified Uncertainties: Combining Input, Data and Model Uncertainty into a Single Formulation

Matias Valdenegro-Toro, Ivo Pascal de Jong, Marco Zullich

Modelling uncertainty in Machine Learning models is essential for achieving safe and reliable predictions. Most research on uncertainty focuses on output uncertainty (predictions), but minimal attention is paid to uncertainty at inputs. We propose a method for propagating uncertainty in the inputs through a Neural Network that is simultaneously able to estimate input, data, and model uncertainty. Our results show that this propagation of input uncertainty results in a more stable decision boundary even under large amounts of input noise than comparatively simple Monte Carlo sampling. Additionally, we discuss and demonstrate that input uncertainty, when propagated through the model, results in model uncertainty at the outputs. The explicit incorporation of input uncertainty may be beneficial in situations where the amount of input uncertainty is known, though good datasets for this are still needed.
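The Monte Carlo sampling that the abstract names as the comparison baseline can be sketched as follows: perturb the input with noise of known standard deviation, run the model on each perturbed copy, and summarize the spread of outputs. This illustrates the baseline only, not the propagation method the paper proposes; the toy model, the Gaussian noise assumption, and all names are hypothetical.

```python
import math
import random


def mc_input_uncertainty(predict, x, sigma, n_samples=1000, seed=0):
    """Estimate output mean and variance under Gaussian input noise.

    predict: function mapping a list of floats to a float.
    x: the nominal input; sigma: assumed input noise std. dev.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    outputs = []
    for _ in range(n_samples):
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        outputs.append(predict(noisy))
    mean = sum(outputs) / n_samples
    variance = sum((o - mean) ** 2 for o in outputs) / n_samples
    return mean, variance


def toy_model(x):
    """Hypothetical stand-in model: logistic function of the input sum."""
    return 1.0 / (1.0 + math.exp(-sum(x)))


print(mc_input_uncertainty(toy_model, [0.5, -0.2], sigma=0.0))  # no spread
print(mc_input_uncertainty(toy_model, [0.5, -0.2], sigma=1.0))  # output varies
```

The cost of this baseline grows with the number of samples, since every sample needs a full forward pass through the model.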

LG · Mar 14, 2024
Uncertainty Quantification for cross-subject Motor Imagery classification

Prithviraj Manivannan, Ivo Pascal de Jong, Matias Valdenegro-Toro et al.

Uncertainty Quantification aims to determine when the prediction from a Machine Learning model is likely to be wrong. Computer Vision research has explored methods for determining epistemic uncertainty (also known as model uncertainty), which should correspond with generalisation error. In theory, these methods make it possible to predict misclassifications due to inter-subject variability. We applied a variety of Uncertainty Quantification methods to predict misclassifications for a Motor Imagery Brain-Computer Interface. Deep Ensembles performed best, both in terms of classification performance and cross-subject Uncertainty Quantification performance. However, we found that standard CNNs with Softmax output performed better than some of the more advanced methods.

SP · Mar 14, 2024
Transferring BCI models from calibration to control: Observing shifts in EEG features

Ivo Pascal de Jong, Lüke Luna van den Wittenboer, Matias Valdenegro-Toro et al.

Public Motor Imagery-based brain-computer interface (BCI) datasets are being used to develop increasingly good classifiers. However, they usually follow discrete paradigms where participants perform Motor Imagery at regularly timed intervals. It is often unclear what changes may happen in the EEG patterns when users attempt to perform a control task with such a BCI. This may lead to generalisation errors. We demonstrate a new paradigm containing a standard calibration session and a novel BCI control session based on EMG. This allows us to observe similarities in sensorimotor rhythms, and observe the additional preparation effects introduced by the control paradigm. In the Movement Related Cortical Potentials we found large differences between the calibration and control sessions. We demonstrate a CSP-based Machine Learning model trained on the calibration data that can make surprisingly good predictions on the BCI-controlled driving data.