Francesco Guarnera

CV
h-index 43
10 papers
30 citations
Novelty 49%
AI Score 54

10 Papers

CV Feb 4 Code
SynthForensics: A Multi-Generator Benchmark for Detecting Synthetic Video Deepfakes

Roberto Leotta, Salvatore Alfio Sambataro, Claudio Vittorio Ragaglia et al.

The landscape of synthetic media has been irrevocably altered by text-to-video (T2V) models, whose outputs are rapidly approaching indistinguishability from reality. Critically, this technology is no longer confined to large-scale labs; the proliferation of efficient, open-source generators is democratizing the ability to create high-fidelity synthetic content on consumer-grade hardware. This makes existing face-centric and manipulation-based benchmarks obsolete. To address this urgent threat, we introduce SynthForensics, to the best of our knowledge the first human-centric benchmark for detecting purely synthetic video deepfakes. The benchmark comprises 6,815 unique videos from five architecturally distinct, state-of-the-art open-source T2V models. Its construction was underpinned by a meticulous two-stage, human-in-the-loop validation to ensure high semantic and visual quality. Each video is provided in four versions (raw, lossless, light, and heavy compression) to enable real-world robustness testing. Experiments demonstrate that state-of-the-art detectors are fragile and generalize poorly when evaluated on this new domain: we observe a mean performance drop of $29.19\%$ AUC, with some methods performing worse than random chance, and top models losing over 30 points under heavy compression. The paper further investigates whether training on SynthForensics can mitigate these performance gaps, achieving robust generalization to unseen generators ($93.81\%$ AUC), though at the cost of reduced backward compatibility with traditional manipulation-based deepfakes. The complete dataset and all generation metadata, including the specific prompts and inference parameters for every video, will be made publicly available at [link anonymized for review].

CV Jul 19, 2025 Code
Benchmarking GANs, Diffusion Models, and Flow Matching for T1w-to-T2w MRI Translation

Andrea Moschetto, Lemuel Puglisi, Alec Sargood et al.

Magnetic Resonance Imaging (MRI) enables the acquisition of multiple image contrasts, such as T1-weighted (T1w) and T2-weighted (T2w) scans, each offering distinct diagnostic insights. However, acquiring all desired modalities increases scan time and cost, motivating research into computational methods for cross-modal synthesis. To address this, recent approaches aim to synthesize missing MRI contrasts from those already acquired, reducing acquisition time while preserving diagnostic quality. Image-to-image (I2I) translation provides a promising framework for this task. In this paper, we present a comprehensive benchmark of generative models (specifically, Generative Adversarial Networks (GANs), diffusion models, and flow matching (FM) techniques) for T1w-to-T2w 2D MRI I2I translation. All frameworks are implemented with comparable settings and evaluated on three publicly available MRI datasets of healthy adults. Our quantitative and qualitative analyses show that the GAN-based Pix2Pix model outperforms diffusion and FM-based methods in terms of structural fidelity, image quality, and computational efficiency. Consistent with existing literature, these results suggest that flow-based models are prone to overfitting on small datasets and simpler tasks, and may require more data to match or surpass GAN performance. These findings offer practical guidance for deploying I2I translation techniques in real-world MRI workflows and highlight promising directions for future research in cross-modal medical image synthesis. Code and models are publicly available at https://github.com/AndreaMoschetto/medical-I2I-benchmark.

53.4 IV Mar 13 Code
Reinforcing the Weakest Links: Modernizing SIENA with Targeted Deep Learning Integration

Riccardo Raciti, Lemuel Puglisi, Francesco Guarnera et al.

Percentage Brain Volume Change (PBVC) derived from Magnetic Resonance Imaging (MRI) is a widely used biomarker of brain atrophy, with SIENA among the most established methods for its estimation. However, SIENA relies on classical image processing steps, particularly skull stripping and tissue segmentation, whose failures can propagate through the pipeline and bias atrophy estimates. In this work, we examine whether targeted deep learning substitutions can improve SIENA while preserving its established and interpretable framework. To this end, we integrate SynthStrip and SynthSeg into SIENA and evaluate three pipeline variants on the ADNI and PPMI longitudinal cohorts. Performance is assessed using three complementary criteria: correlation with longitudinal clinical and structural decline, scan-order consistency, and end-to-end runtime. Replacing the skull-stripping module yields the most consistent gains: in ADNI, it substantially strengthens associations between PBVC and multiple measures of disease progression relative to the standard SIENA pipeline, while across both datasets it markedly improves robustness under scan reversal. The fully integrated pipeline achieves the strongest scan-order consistency, reducing the error by up to 99.1%. In addition, GPU-enabled variants reduce execution time by up to 46% while maintaining CPU runtimes comparable to standard SIENA. Overall, these findings show that deep learning can meaningfully strengthen established longitudinal atrophy pipelines when used to reinforce their weakest image processing steps. More broadly, this study highlights the value of modularly modernizing clinically trusted neuroimaging tools without sacrificing their interpretability. Code is publicly available at https://github.com/Raciti/Enhanced-SIENA.git.
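
The abstract does not define how scan-order consistency is scored. As a rough sketch only, one plausible reading is that swapping the two scans should flip the sign of PBVC, so the sum of the forward and backward estimates should be close to zero. The function name and all numbers below are hypothetical and are not taken from the paper.

```python
def scan_order_error(pbvc_forward, pbvc_backward):
    """Deviation from antisymmetry (assumed metric, not the paper's definition).

    Returns zero when reversing the scan order only flips the sign of PBVC.
    """
    return abs(pbvc_forward + pbvc_backward)


def relative_reduction(error_before, error_after):
    """Percentage by which the error shrinks from one pipeline to another."""
    return 100.0 * (1.0 - error_after / error_before)


# Hypothetical PBVC values in percent, for illustration only.
baseline_error = scan_order_error(-1.20, 1.00)   # standard pipeline
improved_error = scan_order_error(-1.20, 1.19)   # modified pipeline
reduction = relative_reduction(baseline_error, improved_error)
print(f"{baseline_error:.2f} -> {improved_error:.2f} ({reduction:.0f}% lower)")
```

Under this assumed metric, a figure such as "reducing the error by up to 99.1%" would correspond to `relative_reduction` evaluated on the worst and best pipeline variants.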

CV Sep 20, 2025 Code
A Novel Metric for Detecting Memorization in Generative Models for Brain MRI Synthesis

Antonio Scardace, Lemuel Puglisi, Francesco Guarnera et al.

Deep generative models have emerged as a transformative tool in medical imaging, offering substantial potential for synthetic data generation. However, recent empirical studies highlight a critical vulnerability: these models can memorize sensitive training data, posing significant risks of unauthorized patient information disclosure. Detecting memorization in generative models remains particularly challenging, necessitating scalable methods capable of identifying training data leakage across large sets of generated samples. In this work, we propose DeepSSIM, a novel self-supervised metric for quantifying memorization in generative models. DeepSSIM is trained to: i) project images into a learned embedding space and ii) force the cosine similarity between embeddings to match the ground-truth SSIM (Structural Similarity Index) scores computed in the image space. To capture domain-specific anatomical features, training incorporates structure-preserving augmentations, allowing DeepSSIM to estimate similarity reliably without requiring precise spatial alignment. We evaluate DeepSSIM in a case study involving synthetic brain MRI data generated by a Latent Diffusion Model (LDM) trained under memorization-prone conditions, using 2,195 MRI scans from two publicly available datasets (IXI and CoRR). Compared to state-of-the-art memorization metrics, DeepSSIM achieves superior performance, improving F1 scores by an average of +52.03% over the best existing method. Code and data of our approach are publicly available at the following link: https://github.com/brAIn-science/DeepSSIM.
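
The training objective described above (cosine similarity between embeddings regressed onto image-space SSIM) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the DeepSSIM implementation: `embed` is a hypothetical stand-in for the learned encoder (here a fixed random linear projection), `global_ssim` uses a single window over the whole image rather than the usual sliding-window SSIM, and the squared-error form of the loss is an assumption.

```python
import numpy as np


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over the whole image (simplification of windowed SSIM)."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den


def cosine_similarity(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def embed(image, weights):
    """Hypothetical encoder: a fixed linear projection of the flattened image."""
    return weights @ image.ravel()


def similarity_matching_loss(img_a, img_b, weights):
    """Squared gap between embedding cosine similarity and image-space SSIM."""
    target = global_ssim(img_a, img_b)
    pred = cosine_similarity(embed(img_a, weights), embed(img_b, weights))
    return (pred - target) ** 2


rng = np.random.default_rng(0)
weights = rng.normal(size=(16, 32 * 32))   # untrained, illustrative only
img_a = rng.random((32, 32))
img_b = rng.random((32, 32))
loss_same = similarity_matching_loss(img_a, img_a, weights)
loss_diff = similarity_matching_loss(img_a, img_b, weights)
```

For an identical pair both SSIM and cosine similarity equal 1, so the loss vanishes; for distinct images, training would adjust the encoder until the two similarities agree.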

IV Jun 1, 2024 Code
SynthBA: Reliable Brain Age Estimation Across Multiple MRI Sequences and Resolutions

Lemuel Puglisi, Alessia Rondinella, Linda De Meo et al.

Brain age is a critical measure that reflects the biological ageing process of the brain. The gap between brain age and chronological age, referred to as brain PAD (Predicted Age Difference), has been utilized to investigate neurodegenerative conditions. Brain age can be predicted using MRIs and machine learning techniques. However, existing methods are often sensitive to acquisition-related variabilities, such as differences in acquisition protocols, scanners, MRI sequences, and resolutions, significantly limiting their application in highly heterogeneous clinical settings. In this study, we introduce Synthetic Brain Age (SynthBA), a robust deep-learning model designed for predicting brain age. SynthBA utilizes an advanced domain randomization technique, ensuring effective operation across a wide array of acquisition-related variabilities. To assess the effectiveness and robustness of SynthBA, we evaluate its predictive capabilities on internal and external datasets, encompassing various MRI sequences and resolutions, and compare it with state-of-the-art techniques. Additionally, we calculate the brain PAD in a large cohort of subjects with Alzheimer's Disease (AD), demonstrating a significant correlation with AD-related measures of cognitive dysfunction. SynthBA holds the potential to facilitate the broader adoption of brain age prediction in clinical settings, where re-training or fine-tuning is often unfeasible. The SynthBA source code and pre-trained models are publicly available at https://github.com/LemuelPuglisi/SynthBA.
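
As a small worked example of the brain PAD defined above, the sketch below computes the gap and correlates it with a cognitive score. The sign convention (predicted minus chronological age) is the common one but is an assumption here, and every number is hypothetical rather than data from the study.

```python
import numpy as np


def brain_pad(predicted_age, chronological_age):
    """Brain PAD in years: predicted brain age minus chronological age (assumed sign)."""
    predicted = np.asarray(predicted_age, dtype=float)
    chronological = np.asarray(chronological_age, dtype=float)
    return predicted - chronological


# Hypothetical values for illustration only.
predicted = [72.5, 68.0, 80.1, 75.3]
chronological = [70.0, 69.0, 74.0, 71.0]
cognitive_score = [1.0, 0.5, 4.0, 2.5]   # hypothetical dysfunction score, higher = worse

pad = brain_pad(predicted, chronological)
# Pearson correlation between brain PAD and the score.
r = np.corrcoef(pad, cognitive_score)[0, 1]
print(pad, round(r, 3))
```

A positive PAD means the brain appears older than the subject's chronological age.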

CV Mar 21, 2024
On the exploitation of DCT statistics for cropping detectors

Claudio Vittorio Ragaglia, Francesco Guarnera, Sebastiano Battiato

The study of frequency components derived from the Discrete Cosine Transform (DCT) has been widely used in image analysis. In recent years it has been observed that significant information about the lifecycle of an image can be extracted from them, but no study has analyzed their relationship with the source resolution of the image. In this work, we investigate a novel image resolution classifier that employs DCT statistics with the goal of detecting the original resolution of images; in particular, this insight is exploited to address the challenge of identifying cropped images. A Machine Learning (ML) classifier trained on entire (uncropped) images can leverage this information to detect cropping. The results demonstrate the classifier's reliability in distinguishing between cropped and uncropped images, providing a dependable estimation of their original resolution. This advancement has significant implications for image processing applications, including digital security, authenticity verification, and visual quality analysis, by offering a new tool for detecting image manipulations and enhancing qualitative image assessment. This work opens new perspectives in the field, with potential to transform image analysis and usage across multiple domains.
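
To make "DCT statistics" concrete, the sketch below computes per-frequency statistics over 8x8 blocks, which could then be fed to a classifier. The abstract does not say which statistics or block size the authors use, so the choice of mean absolute value and standard deviation over 8x8 blocks is an assumption, and the function names are hypothetical.

```python
import numpy as np


def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m


def block_dct_stats(image, block=8):
    """Per-frequency mean absolute value and standard deviation over all blocks."""
    h, w = image.shape
    h, w = h - h % block, w - w % block      # drop incomplete border blocks
    d = dct_matrix(block)
    blocks = (image[:h, :w]
              .reshape(h // block, block, w // block, block)
              .transpose(0, 2, 1, 3)
              .reshape(-1, block, block))
    coeffs = d @ blocks @ d.T                # 2D DCT of every block
    return np.abs(coeffs).mean(axis=0), coeffs.std(axis=0)


rng = np.random.default_rng(0)
image = rng.random((64, 48))                 # stand-in for a grayscale image
mean_abs, std = block_dct_stats(image)
features = np.concatenate([mean_abs.ravel(), std.ravel()])   # 128-dim feature vector
```

The resulting `features` vector is the kind of input a standard ML classifier could be trained on to separate resolution classes.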

CV Sep 3, 2025
Temporally-Aware Diffusion Model for Brain Progression Modelling with Bidirectional Temporal Regularisation

Mattia Litrico, Francesco Guarnera, Mario Valerio Giuffrida et al.

Generating realistic MRIs to accurately predict future changes in the structure of the brain is an invaluable tool for clinicians in assessing clinical outcomes and analysing disease progression at the patient level. However, existing methods present some limitations: (i) some approaches fail to explicitly capture the relationship between structural changes and time intervals, especially when trained on age-imbalanced datasets; (ii) others rely only on scan interpolation, which lacks clinical utility because it generates intermediate images between timepoints rather than future pathological progression; and (iii) most approaches rely on 2D slice-based architectures, thereby disregarding full 3D anatomical context, which is essential for accurate longitudinal predictions. We propose a 3D Temporally-Aware Diffusion Model (TADM-3D), which accurately predicts brain progression on MRI volumes. To better model the relationship between time interval and brain changes, TADM-3D uses a pre-trained Brain-Age Estimator (BAE) that guides the diffusion model in the generation of MRIs that accurately reflect the expected age difference between baseline and generated follow-up scans. Additionally, to further improve the temporal awareness of TADM-3D, we propose the Back-In-Time Regularisation (BITR), which trains TADM-3D to predict bidirectionally, from the baseline to follow-up (forward) as well as from the follow-up to baseline (backward). Although predicting past scans has limited clinical applications, this regularisation helps the model generate temporally more accurate scans. We train and evaluate TADM-3D on the OASIS-3 dataset, and we validate the generalisation performance on an external test set from the NACC dataset. The code will be available upon acceptance.

IV Jun 18, 2024
TADM: Temporally-Aware Diffusion Model for Neurodegenerative Progression on Brain MRI

Mattia Litrico, Francesco Guarnera, Valerio Giuffrida et al.

Generating realistic images to accurately predict changes in the structure of brain MRI is a crucial tool for clinicians. Such applications help assess patients' outcomes and analyze how diseases progress at the individual level. However, existing methods for this task present some limitations. Some approaches attempt to model the distribution of MRI scans directly by conditioning the model on patients' ages, but they fail to explicitly capture the relationship between structural changes in the brain and time intervals, especially on age-unbalanced datasets. Other approaches simply rely on interpolation between scans, which limits their clinical application as they do not predict future MRIs. To address these challenges, we propose a Temporally-Aware Diffusion Model (TADM), which introduces a novel approach to accurately infer progression in brain MRIs. TADM learns the distribution of structural changes in terms of intensity differences between scans and combines the prediction of these changes with the initial baseline scans to generate future MRIs. Furthermore, during training, we propose to leverage a pre-trained Brain-Age Estimator (BAE) to refine the model's training process, enhancing its ability to produce accurate MRIs that match the expected age gap between baseline and generated scans. Our assessment, conducted on the OASIS-3 dataset, uses similarity metrics and region sizes computed by comparing predicted and real follow-up scans on 3 relevant brain regions. TADM achieves large improvements over existing approaches, with an average decrease of 24% in region size error and an improvement of 4% in similarity metrics. These evaluations demonstrate the improvement of our model in mimicking temporal brain neurodegenerative progression compared to existing methods. Our approach will benefit applications, such as predicting patient outcomes or improving treatments for patients.
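
The two ideas in this abstract, predicting an intensity change and combining it with the baseline, then scoring region sizes against the real follow-up, can be sketched as follows. This assumes the combination is a voxel-wise addition, that intensities are normalised to [0, 1], and that region size error is a relative difference in voxel counts; none of these details are confirmed by the abstract, and the arrays are toy data.

```python
import numpy as np


def compose_follow_up(baseline, predicted_residual):
    """Add the predicted intensity change to the baseline scan (assumed combination)."""
    return np.clip(baseline + predicted_residual, 0.0, 1.0)


def region_size_error(pred_mask, true_mask):
    """Absolute difference in region size, as a fraction of the true region size."""
    true_size = int(true_mask.sum())
    return abs(int(pred_mask.sum()) - true_size) / true_size


# Toy 2D "scan" and residual; a real model would predict the residual.
baseline = np.full((4, 4), 0.5)
residual = np.zeros((4, 4))
residual[1:3, 1:3] = -0.2                     # hypothetical local intensity loss
follow_up = compose_follow_up(baseline, residual)

# Toy segmentation masks of one brain region (10 vs 8 voxels).
pred_mask = np.zeros(16, dtype=bool)
pred_mask[:10] = True
true_mask = np.zeros(16, dtype=bool)
true_mask[:8] = True
error = region_size_error(pred_mask, true_mask)   # (10 - 8) / 8
```

Predicting the residual rather than the full image keeps the baseline anatomy fixed, so the model only has to account for what changes over the time interval.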

IV Aug 7, 2020
In-Depth DCT Coefficient Distribution Analysis for First Quantization Estimation

Sebastiano Battiato, Oliver Giudice, Francesco Guarnera et al.

The exploitation of traces in JPEG double compressed images is of utmost importance for investigations. By properly exploiting such traces, First Quantization Estimation (FQE) can be performed in order to obtain source camera model identification (CMI) and therefore reconstruct the history of a digital image. In this paper, a method able to estimate the first quantization factors for JPEG double compressed images is presented, employing a mixed statistical and Machine Learning approach. The presented solution is demonstrated to work without any a priori assumptions about the quantization matrices. Experimental results and comparisons with the state of the art demonstrate the effectiveness of the proposed technique.

CV Jul 9, 2020
Animated GIF optimization by adaptive color local table management

Oliver Giudice, Dario Allegra, Francesco Guarnera et al.

Thirty years after its introduction, the GIF file format is more popular than ever, serving as a great means of communication among friends and communities on instant messengers and social networks. Despite this popularity, the original compression method used to encode GIF images has not changed at all. At the same time, popularity means that storage savings become an issue for hosting platforms. In this paper, a parametric optimization technique for animated GIFs is presented. The proposed technique is based on Local Color Table selection and color remapping in order to create optimized animated GIFs while preserving the original format. The technique achieves good results in terms of byte reduction with limited or no loss of perceived color quality. Tests carried out on 1000 GIF files demonstrate the effectiveness of the proposed optimization strategy.