Andrea Visentin

LG
h-index9
11papers
65citations
Novelty27%
AI Score43

11 Papers

DCFeb 24, 2023
Forecasting Workload in Cloud Computing: Towards Uncertainty-Aware Predictions and Transfer Learning

Andrea Rossi, Andrea Visentin, Diego Carraro et al.

Predicting future resource demand in Cloud Computing is essential for optimizing the trade-off between serving customers' requests efficiently and minimizing the provisioning cost. Modelling prediction uncertainty is also desirable to better inform the resource decision-making process, but research in this field is under-investigated. In this paper, we propose univariate and bivariate Bayesian deep learning models that provide predictions of future workload demand and its uncertainty. We run extensive experiments on Google and Alibaba clusters, where we first train our models with datasets from different cloud providers and compare them with LSTM-based baselines. Results show that modelling the uncertainty of predictions has a positive impact on performance, especially on service level metrics, because uncertainty quantification can be tailored to desired target service levels that are critical in cloud applications. Moreover, we investigate whether our models benefit transfer learning capabilities across different domains, i.e. dataset distributions. Experiments on the same workload datasets reveal that acceptable transfer learning performance can be achieved within the same provider (because distributions are more similar). Also, domain knowledge does not transfer when the source and target domains are very different (e.g. from different providers), but this performance degradation can be mitigated by increasing the training set size of the source domain.

AIApr 29, 2022
SATfeatPy -- A Python-based Feature Extraction System for Satisfiability

Benjamin Provan-Bessell, Marco Dalla, Andrea Visentin et al.

Feature extraction is a fundamental task in the application of machine learning methods to SAT solving. It is used in algorithm selection and configuration for solver portfolios and satisfiability classification. Many approaches have been proposed to extract meaningful attributes from CNF instances. Most of them lack a working/updated implementation, and the limited descriptions lack clarity affecting the reproducibility. Furthermore, the literature misses a comparison among the features. This paper introduces SATfeatPy, a library that offers feature extraction techniques for SAT problems in the CNF form. This package offers the implementation of all the structural and statistical features from there major papers in the field. The library is provided in an up-to-date, easy-to-use Python package alongside a detailed feature description. We show the high accuracy of SAT/UNSAT and problem category classification, using five sets of features generated using our library from a dataset of 3000 SAT and UNSAT instances, over ten different classes of problems. Finally, we compare the usefulness of the features and importance for predicting a SAT instance's original structure in an ablation study.

LGMar 19
SHAPCA: Consistent and Interpretable Explanations for Machine Learning Models on Spectroscopy Data

Mingxing Zhang, Nicola Rossberg, Simone Innocente et al.

In recent years, machine learning models have been increasingly applied to spectroscopic datasets for chemical and biomedical analysis. For their successful adoption, particularly in clinical and safety-critical settings, professionals and researchers must be able to understand and trust the reasoning behind model predictions. However, the inherently high dimensionality and strong collinearity of spectroscopy data pose a fundamental challenge to model explainability. These properties not only complicate model training but also undermine the stability and consistency of explanations, leading to fluctuations in feature importance across repeated training runs. Feature extraction techniques have been used to reduce the input dimensionality; these new features hinder the connection between the prediction and the original signal. This study proposes SHAPCA, an explainable machine learning pipeline that combines Principal Component Analysis (for dimensionality reduction) and Shapely Additive exPlanations (for post hoc explanation) to provide explanations in the original input space, which a practitioner can interpret and link back to the biological components. The proposed framework enables analysis from both global and local perspectives, revealing the spectral bands that drive overall model behaviour as well as the instance-specific features that influence individual predictions. Numerical analysis demonstrated the interpretability of the results and greater consistency across different runs.

LGFeb 9, 2024
Electricity Price Forecasting in the Irish Balancing Market

Ciaran O'Connor, Joseph Collins, Steven Prestwich et al.

Short-term electricity markets are becoming more relevant due to less-predictable renewable energy sources, attracting considerable attention from the industry. The balancing market is the closest to real-time and the most volatile among them. Its price forecasting literature is limited, inconsistent and outdated, with few deep learning attempts and no public dataset. This work applies to the Irish balancing market a variety of price prediction techniques proven successful in the widely studied day-ahead market. We compare statistical, machine learning, and deep learning models using a framework that investigates the impact of different training sizes. The framework defines hyperparameters and calibration settings; the dataset and models are made public to ensure reproducibility and to be used as benchmarks for future works. An extensive numerical study shows that well-performing models in the day-ahead market do not perform well in the balancing one, highlighting that these markets are fundamentally different constructs. The best model is LEAR, a statistical approach based on LASSO, which outperforms more complex and computationally demanding approaches.

LGApr 29
Interpretable EEG Microstate Discovery via Variational Deep Embedding: A Systematic Architecture Search with Multi-Quadrant Evaluation

Saheed Faremi, Andrea Visentin, Luca Longo

EEG microstate analysis segments continuous brain electrical activity into brief, quasi-stable topographic configurations that reflect discrete functional brain states. Conventional approaches such as Modified K-Means operate directly in electrode space with hard assignment, offering no learned latent representation, no generative decoder, and no mechanism to decode latent configurations into verifiable scalp topographies, limiting both model transparency and interpretability. To address this, we present a Convolutional Variational Deep Embedding (Conv-VaDE) model that jointly learns topographic reconstruction and probabilistic soft clustering in a shared latent space. Conv-VaDE enables generative decoding of cluster prototypes into verifiable scalp topographies, replacing opaque hard partitioning with probabilistic soft assignment. A polarity invariance scheme and a four-dimensional grid search over cluster count (K from 3 to 20), latent dimensionality, network depth, and channel width are conducted to systematically reveal how each architectural design choice shapes the quality, stability, and interpretability of learned EEG microstate representations. The model is evaluated on the LEMON resting-state eyes-closed EEG dataset with ten participants using topographic template formation, clustering stability, and global explained variance (GEV). The architecture search reveals that depth L = 4 appears consistently across all 18 best-performing configurations, yielding a best-case GEV of 0.730 and a silhouette of 0.229 at K = 4 across the model sweeps, where moderately deep networks with compact channel widths and small latent dimensionality dominate across the full K range. These results establish that principled architecture search, rather than model scale, is the key to interpretable and stable EEG microstate discovery via variational deep embedding.

LGFeb 7, 2025
Conformal Prediction for Electricity Price Forecasting in the Day-Ahead and Real-Time Balancing Market

Ciaran O'Connor, Mohamed Bahloul, Roberto Rossi et al.

The integration of renewable energy into electricity markets poses significant challenges to price stability and increases the complexity of market operations. Accurate and reliable electricity price forecasting is crucial for effective market participation, where price dynamics can be significantly more challenging to predict. Probabilistic forecasting, through prediction intervals, efficiently quantifies the inherent uncertainties in electricity prices, supporting better decision-making for market participants. This study explores the enhancement of probabilistic price prediction using Conformal Prediction (CP) techniques, specifically Ensemble Batch Prediction Intervals and Sequential Predictive Conformal Inference. These methods provide precise and reliable prediction intervals, outperforming traditional models in validity metrics. We propose an ensemble approach that combines the efficiency of quantile regression models with the robust coverage properties of time series adapted CP techniques. This ensemble delivers both narrow prediction intervals and high coverage, leading to more reliable and accurate forecasts. We further evaluate the practical implications of CP techniques through a simulated trading algorithm applied to a battery storage system. The ensemble approach demonstrates improved financial returns in energy trading in both the Day-Ahead and Balancing Markets, highlighting its practical benefits for market participants.

IVMar 3, 2025
Machine Learning Applications to Diffuse Reflectance Spectroscopy in Optical Diagnosis; A Systematic Review

Nicola Rossberg, Celina L. Li, Simone Innocente et al.

Diffuse Reflectance Spectroscopy has demonstrated a strong aptitude for identifying and differentiating biological tissues. However, the broadband and smooth nature of these signals require algorithmic processing, as they are often difficult for the human eye to distinguish. The implementation of machine learning models for this task has demonstrated high levels of diagnostic accuracies and led to a wide range of proposed methodologies for applications in various illnesses and conditions. In this systematic review, we summarise the state of the art of these applications, highlight current gaps in research and identify future directions. This review was conducted in accordance with the PRISMA guidelines. 77 studies were retrieved and in-depth analysis was conducted. It is concluded that diffuse reflectance spectroscopy and machine learning have strong potential for tissue differentiation in clinical applications, but more rigorous sample stratification in tandem with in-vivo validation and explainable algorithm development is required going forward.

HCApr 7
Improving Explanations: Applying the Feature Understandability Scale for Cost-Sensitive Feature Selection

Nicola Rossberg, Bennett Kleinberg, Barry O'Sullivan et al.

With the growing pervasiveness of artificial intelligence, the ability to explain the inferences made by machine learning models has become increasingly important. Numerous techniques for model explainability have been proposed, with natural-language textual explanations among the most widely used approaches. When applied to tabular data, these explanations typically draw on input features to justify a given inference. Consequently, a user's ability to interpret the explanation depends on their understanding of the input features. To quantify this feature-level understanding, Rossberg et al. introduced the Feature Understandability Scale. Building on that work, this proof-of-concept study collects understandability scores across two datasets, proposes a co-optimisation methodology of understandability and accuracy and presents the resulting explanations alongside the model accuracies. This work contributes to the body of knowledge on model interpretability by design. It is found that accuracy and understandability can be successfully co-optimised while maintaining high classification performances. The resulting explanations are considered more understandable at face value. Further research will aim to confirm these findings through user evaluation.

STOct 28, 2025
The Evolution of Probabilistic Price Forecasting Techniques: A Review of the Day-Ahead, Intra-Day, and Balancing Markets

Ciaran O'Connor, Mohamed Bahloul, Steven Prestwich et al.

Electricity price forecasting has become a critical tool for decision-making in energy markets, particularly as the increasing penetration of renewable energy introduces greater volatility and uncertainty. Historically, research in this field has been dominated by point forecasting methods, which provide single-value predictions but fail to quantify uncertainty. However, as power markets evolve due to renewable integration, smart grids, and regulatory changes, the need for probabilistic forecasting has become more pronounced, offering a more comprehensive approach to risk assessment and market participation. This paper presents a review of probabilistic forecasting methods, tracing their evolution from Bayesian and distribution based approaches, through quantile regression techniques, to recent developments in conformal prediction. Particular emphasis is placed on advancements in probabilistic forecasting, including validity-focused methods which address key limitations in uncertainty estimation. Additionally, this review extends beyond the Day-Ahead Market to include the Intra-Day and Balancing Markets, where forecasting challenges are intensified by higher temporal granularity and real-time operational constraints. We examine state of the art methodologies, key evaluation metrics, and ongoing challenges, such as forecast validity, model selection, and the absence of standardised benchmarks, providing researchers and practitioners with a comprehensive and timely resource for navigating the complexities of modern electricity markets.

CVApr 9, 2025
Zero-Shot Image-Based Large Language Model Approach to Road Pavement Monitoring

Shuoshuo Xu, Kai Zhao, James Loney et al.

Effective and rapid evaluation of pavement surface condition is critical for prioritizing maintenance, ensuring transportation safety, and minimizing vehicle wear and tear. While conventional manual inspections suffer from subjectivity, existing machine learning-based methods are constrained by their reliance on large and high-quality labeled datasets, which require significant resources and limit adaptability across varied road conditions. The revolutionary advancements in Large Language Models (LLMs) present significant potential for overcoming these challenges. In this study, we propose an innovative automated zero-shot learning approach that leverages the image recognition and natural language understanding capabilities of LLMs to assess road conditions effectively. Multiple LLM-based assessment models were developed, employing prompt engineering strategies aligned with the Pavement Surface Condition Index (PSCI) standards. These models' accuracy and reliability were evaluated against official PSCI results, with an optimized model ultimately selected. Extensive tests benchmarked the optimized model against evaluations from various levels experts using Google Street View road images. The results reveal that the LLM-based approach can effectively assess road conditions, with the optimized model -employing comprehensive and structured prompt engineering strategies -outperforming simpler configurations by achieving high accuracy and consistency, even surpassing expert evaluations. Moreover, successfully applying the optimized model to Google Street View images demonstrates its potential for future city-scale deployments. These findings highlight the transformative potential of LLMs in automating road damage evaluations and underscore the pivotal role of detailed prompt engineering in achieving reliable assessments.

LGJun 19, 2024
Optimizing Quantile-based Trading Strategies in Electricity Arbitrage

Ciaran O'Connor, Joseph Collins, Steven Prestwich et al.

Efficiently integrating renewable resources into electricity markets is vital for addressing the challenges of matching real-time supply and demand while reducing the significant energy wastage resulting from curtailments. To address this challenge effectively, the incorporation of storage devices can enhance the reliability and efficiency of the grid, improving market liquidity and reducing price volatility. In short-term electricity markets, participants navigate numerous options, each presenting unique challenges and opportunities, underscoring the critical role of the trading strategy in maximizing profits. This study delves into the optimization of day-ahead and balancing market trading, leveraging quantile-based forecasts. Employing three trading approaches with practical constraints, our research enhances forecast assessment, increases trading frequency, and employs flexible timestamp orders. Our findings underscore the profit potential of simultaneous participation in both day-ahead and balancing markets, especially with larger battery storage systems; despite increased costs and narrower profit margins associated with higher-volume trading, the implementation of high-frequency strategies plays a significant role in maximizing profits and addressing market challenges. Finally, we modelled four commercial battery storage systems and evaluated their economic viability through a scenario analysis, with larger batteries showing a shorter return on investment.