Luca Gherardini

LG
h-index29
3papers
5citations
Novelty42%
AI Score37

3 Papers

LGAug 23, 2023
CACTUS: a Comprehensive Abstraction and Classification Tool for Uncovering Structures

Luca Gherardini, Varun Ravi Varma, Karol Capala et al.

The availability of large data sets is providing an impetus for driving current artificial intelligent developments. There are, however, challenges for developing solutions with small data sets due to practical and cost-effective deployment and the opacity of deep learning models. The Comprehensive Abstraction and Classification Tool for Uncovering Structures called CACTUS is presented for improved secure analytics by effectively employing explainable artificial intelligence. It provides additional support for categorical attributes, preserving their original meaning, optimising memory usage, and speeding up the computation through parallelisation. It shows to the user the frequency of the attributes in each class and ranks them by their discriminative power. Its performance is assessed by application to the Wisconsin diagnostic breast cancer and Thyroid0387 data sets.

LGFeb 19
A feature-stable and explainable machine learning framework for trustworthy decision-making under incomplete clinical data

Justyna Andrys-Olek, Paulina Tworek, Luca Gherardini et al.

Machine learning models are increasingly applied to biomedical data, yet their adoption in high stakes domains remains limited by poor robustness, limited interpretability, and instability of learned features under realistic data perturbations, such as missingness. In particular, models that achieve high predictive performance may still fail to inspire trust if their key features fluctuate when data completeness changes, undermining reproducibility and downstream decision-making. Here, we present CACTUS (Comprehensive Abstraction and Classification Tool for Uncovering Structures), an explainable machine learning framework explicitly designed to address these challenges in small, heterogeneous, and incomplete clinical datasets. CACTUS integrates feature abstraction, interpretable classification, and systematic feature stability analysis to quantify how consistently informative features are preserved as data quality degrades. Using a real-world haematuria cohort comprising 568 patients evaluated for bladder cancer, we benchmark CACTUS against widely used machine learning approaches, including random forests and gradient boosting methods, under controlled levels of randomly introduced missing data. We demonstrate that CACTUS achieves competitive or superior predictive performance while maintaining markedly higher stability of top-ranked features as missingness increases, including in sex-stratified analyses. Our results show that feature stability provides information complementary to conventional performance metrics and is essential for assessing the trustworthiness of machine learning models applied to biomedical data. By explicitly quantifying robustness to missing data and prioritising interpretable, stable features, CACTUS offers a generalizable framework for trustworthy data-driven decision support.

LGJun 16, 2025
CACTUS as a Reliable Tool for Early Classification of Age-related Macular Degeneration

Luca Gherardini, Imre Lengyel, Tunde Peto et al.

Machine Learning (ML) is used to tackle various tasks, such as disease classification and prediction. The effectiveness of ML models relies heavily on having large amounts of complete data. However, healthcare data is often limited or incomplete, which can hinder model performance. Additionally, issues like the trustworthiness of solutions vary with the datasets used. The lack of transparency in some ML models further complicates their understanding and use. In healthcare, particularly in the case of Age-related Macular Degeneration (AMD), which affects millions of older adults, early diagnosis is crucial due to the absence of effective treatments for reversing progression. Diagnosing AMD involves assessing retinal images along with patients' symptom reports. There is a need for classification approaches that consider genetic, dietary, clinical, and demographic factors. Recently, we introduced the -Comprehensive Abstraction and Classification Tool for Uncovering Structures-(CACTUS), aimed at improving AMD stage classification. CACTUS offers explainability and flexibility, outperforming standard ML models. It enhances decision-making by identifying key factors and providing confidence in its results. The important features identified by CACTUS allow us to compare with existing medical knowledge. By eliminating less relevant or biased data, we created a clinical scenario for clinicians to offer feedback and address biases.