CV · Dec 30, 2022
Image Embedding for Denoising Generative Models
Andrea Asperti, Davide Evangelista, Samuele Marro et al.
Denoising Diffusion Models are gaining increasing popularity in the field of generative modeling for several reasons, including their simple and stable training, excellent generative quality, and solid probabilistic foundation. In this article, we address the problem of embedding an image into the latent space of Denoising Diffusion Models, that is, finding a suitable "noisy" image whose denoising results in the original image. We particularly focus on Denoising Diffusion Implicit Models due to the deterministic nature of their reverse diffusion process. As a side result of our investigation, we gain a deeper insight into the structure of the latent space of diffusion models, opening interesting perspectives on its exploration, the definition of semantic trajectories, and the manipulation/conditioning of encodings for editing purposes. A particularly interesting property highlighted by our research, which is also characteristic of this class of generative models, is the independence of the latent representation from the networks implementing the reverse diffusion process. In other words, a common seed passed to different networks (each trained on the same dataset) eventually results in identical images.
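To make the "deterministic reverse diffusion" mentioned in this abstract concrete, here is a minimal sketch of the standard noise-free DDIM update applied to a single scalar "pixel". It does not reproduce the embedding techniques studied in the paper. The function names, the schedule `ABARS`, and the closed-form `toy_eps_model` (the optimal noise predictor for a one-dimensional Gaussian dataset) are assumptions chosen only so that the example runs without a trained network.

```python
import math
from typing import Callable, Sequence


def ddim_step(x_t: float, eps: float, abar_t: float, abar_prev: float) -> float:
    """One deterministic DDIM update (no fresh noise is injected)."""
    # Estimate the clean signal from the current state and the predicted noise.
    x0_pred = (x_t - math.sqrt(1.0 - abar_t) * eps) / math.sqrt(abar_t)
    # Re-noise the estimate to the previous (less noisy) level using the same eps.
    return math.sqrt(abar_prev) * x0_pred + math.sqrt(1.0 - abar_prev) * eps


def ddim_decode(z: float, abars: Sequence[float],
                eps_model: Callable[[float, int], float]) -> float:
    """Map a latent z (state at the noisiest step) to a sample at step 0."""
    x = z
    for t in range(len(abars) - 1, 0, -1):
        x = ddim_step(x, eps_model(x, t), abars[t], abars[t - 1])
    return x


# Hypothetical schedule: abars[0] = 1 is the clean level, abars[-1] the noisiest.
ABARS = [1.0, 0.9, 0.5, 0.1]
MU, SIGMA = 2.0, 0.5  # toy 1-D "dataset": a Gaussian with this mean and std


def toy_eps_model(x_t: float, t: int) -> float:
    """Closed-form optimal noise prediction for the toy Gaussian dataset."""
    abar = ABARS[t]
    var_t = abar * SIGMA ** 2 + (1.0 - abar)
    return math.sqrt(1.0 - abar) * (x_t - math.sqrt(abar) * MU) / var_t


first = ddim_decode(0.3, ABARS, toy_eps_model)
second = ddim_decode(0.3, ABARS, toy_eps_model)
other = ddim_decode(-1.2, ABARS, toy_eps_model)
```

Because no randomness enters `ddim_decode`, the same latent always decodes to the same value (`first == second`), while a different latent gives a different one. This is what makes the question of inverting the map, i.e. embedding, well posed.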
NA · Nov 24, 2022
To be or not to be stable, that is the question: understanding neural networks for inverse problems
Davide Evangelista, James Nagy, Elena Morotti et al.
The solution of linear inverse problems arising, for example, in signal and image processing is challenging because ill-conditioning amplifies, in the solution, the noise present in the data. Recently introduced algorithms based on deep learning outperform the more traditional model-based approaches, but they typically suffer from instability with respect to data perturbation. In this paper, we theoretically analyze the trade-off between stability and accuracy of neural networks when they are used to solve linear imaging inverse problems in cases that are not under-determined. Moreover, we propose different supervised and unsupervised solutions to increase the network stability while maintaining good accuracy, by means of regularization properties inherited from a model-based iterative scheme during network training and of pre-processing stabilizing operators in the neural networks. Extensive numerical experiments on image deblurring confirm the theoretical results and the effectiveness of the proposed deep learning-based approaches in handling noise on the data.
CV · Feb 11
A Diffusion-Based Generative Prior Approach to Sparse-view Computed Tomography
Davide Evangelista, Pasquale Cascarano, Elena Loli Piccolomini
The reconstruction of X-ray CT images from sparse or limited-angle geometries is a highly challenging task. The lack of data typically results in artifacts in the reconstructed image and may even lead to object distortions. For this reason, the use of deep generative models in this context is of great interest and holds considerable promise. In the Deep Generative Prior (DGP) framework, the use of diffusion-based generative models is combined with an iterative optimization algorithm for the reconstruction of CT images from sinograms acquired under sparse geometries, to maintain the explainability of a model-based approach while introducing the generative power of a neural network. There are therefore several aspects that can be further investigated within these frameworks to improve reconstruction quality, such as image generation, the model, and the iterative algorithm used to solve the minimization problem, for which we propose modifications with respect to existing approaches. The results obtained even under highly sparse geometries are very promising, although further research is clearly needed in this direction.
CV · Jul 15, 2024
LIP-CAR: contrast agent reduction by a deep learned inverse problem
Davide Bianchi, Sonia Colombo Serra, Davide Evangelista et al.
The adoption of contrast agents in medical imaging protocols is crucial for accurate and timely diagnosis. While highly effective and characterized by an excellent safety profile, the use of contrast agents has its limitations, including a rare risk of allergic reactions, potential environmental impact, and economic burdens on patients and healthcare systems. In this work, we address the contrast agent reduction (CAR) problem, which involves reducing the administered dosage of contrast agent while preserving the visual enhancement. The current literature on the CAR task is based on deep learning techniques within a fully image processing framework. These techniques digitally simulate high-dose images from images acquired with a low dose of contrast agent. We investigate the feasibility of a "learned inverse problem" (LIP) approach, as opposed to the end-to-end paradigm in the state-of-the-art literature. Specifically, we learn the image-to-image operator that maps high-dose images to their corresponding low-dose counterparts, and we frame the CAR task as an inverse problem. We then solve this problem through a regularized optimization reformulation. Regularization methods are well-established mathematical techniques that offer robustness and explainability. Our approach combines these rigorous techniques with cutting-edge deep learning tools. Numerical experiments performed on pre-clinical medical images confirm the effectiveness of this strategy, showing improved stability and accuracy in the simulated high-dose images.
12.4 · CV · May 12
Improving Diffusion Posterior Samplers with Lagged Temporal Corrections for Image Restoration
Davide Evangelista, Elena Morotti, Francesco Pivi et al.
Diffusion-based posterior sampling (PS) is a leading framework for imaging inverse problems, combining learned priors with measurement constraints. Yet, its standard formulations rely on instantaneous data-consistent estimates, which induce temporal variability in the reverse dynamics. We reinterpret PS from a dynamical perspective, showing that the standard PS update corresponds to a first-order discretization of the diffusion dynamics plus a residual correction capturing the mismatch between the denoised prediction and the data-consistent estimate. A second-order discretization, however, naturally introduces a temporal correction based on the variation of consecutive estimates. Building on this, we propose LAMP, combining the second-order update with the residual correction characterizing a PS technique. LAMP thus inherits a lagged temporal correction, and it can be implemented as a modular plug-in over the PS backbone. We show that LAMP preserves the structure of a posterior sampler, and we perform a one-step risk analysis to characterize when LAMP improves the reverse transition via a bias-variance trade-off. Experiments across multiple imaging tasks demonstrate consistent improvements over strong baselines such as DiffPIR and DDRM, without increasing the number of denoising evaluations.
15.1 · LG · Mar 18
CLeAN: Continual Learning Adaptive Normalization in Dynamic Environments
Isabella Marasco, Davide Evangelista, Elena Loli Piccolomini et al.
Artificial intelligence systems predominantly rely on static data distributions, making them ineffective in dynamic real-world environments, such as cybersecurity, autonomous transportation, or finance, where data shifts frequently. Continual learning offers a potential solution by enabling models to learn from sequential data while retaining prior knowledge. However, a critical and underexplored issue in this domain is data normalization. Conventional normalization methods, such as min-max scaling, presuppose access to the entire dataset, which is incongruent with the sequential nature of continual learning. In this paper, we introduce Continual Learning Adaptive Normalization (CLeAN), a novel adaptive normalization technique designed for continual learning in tabular data. CLeAN involves the estimation of global feature scales using learnable parameters that are updated via an Exponential Moving Average (EMA) module, enabling the model to adapt to evolving data distributions. Through comprehensive evaluations on two datasets and various continual learning strategies, including Reservoir Experience Replay, A-GEM, and EWC, we demonstrate that CLeAN not only improves model performance on new data but also mitigates catastrophic forgetting. The findings underscore the importance of adaptive normalization in enhancing the stability and effectiveness of tabular data, offering a novel perspective on the use of normalization to preserve knowledge in dynamic learning environments.
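As a rough illustration of why an exponential moving average suits normalization in a streaming setting, the sketch below keeps a running per-feature minimum and maximum across batches and uses them for min-max scaling. It is not the CLeAN module: the learnable parameters described in the abstract are omitted, and the class name `EmaMinMaxScaler`, the `momentum` value, and the `eps` guard are assumptions made for illustration.

```python
from typing import List, Optional


class EmaMinMaxScaler:
    """Streaming min-max scaler whose feature ranges follow an EMA.

    Hypothetical helper for illustration; not the CLeAN module itself.
    """

    def __init__(self, momentum: float = 0.9, eps: float = 1e-8) -> None:
        self.momentum = momentum  # weight given to the running estimate
        self.eps = eps            # avoids division by zero for constant features
        self.lo: Optional[List[float]] = None
        self.hi: Optional[List[float]] = None

    def update(self, batch: List[List[float]]) -> None:
        """Blend the batch-wise min/max of each feature into the running range."""
        cols = list(zip(*batch))
        batch_lo = [min(c) for c in cols]
        batch_hi = [max(c) for c in cols]
        if self.lo is None or self.hi is None:
            # First batch: initialise the running range directly.
            self.lo, self.hi = batch_lo, batch_hi
            return
        m = self.momentum
        self.lo = [m * old + (1 - m) * new for old, new in zip(self.lo, batch_lo)]
        self.hi = [m * old + (1 - m) * new for old, new in zip(self.hi, batch_hi)]

    def transform(self, batch: List[List[float]]) -> List[List[float]]:
        """Scale each feature with the current running range."""
        if self.lo is None or self.hi is None:
            raise RuntimeError("call update() on at least one batch first")
        return [
            [(x - lo) / (hi - lo + self.eps)
             for x, lo, hi in zip(row, self.lo, self.hi)]
            for row in batch
        ]


scaler = EmaMinMaxScaler(momentum=0.5)
scaler.update([[0.0, 10.0], [1.0, 20.0]])  # running ranges: [0, 1] and [10, 20]
scaler.update([[2.0, 10.0], [4.0, 40.0]])  # ranges drift toward the new batch
scaled = scaler.transform([[2.0, 20.0]])
```

Unlike a scaler fitted once on the full dataset, this one never needs access to past batches, and values outside the current running range simply map outside [0, 1].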
CL · Apr 4, 2025
Language Models Are Implicitly Continuous
Samuele Marro, Davide Evangelista, X. Angelo Huang et al. · oxford
Language is typically modelled with discrete sequences. However, the most successful approaches to language modelling, namely neural networks, are continuous and smooth function approximators. In this work, we show that Transformer-based language models implicitly learn to represent sentences as continuous-time functions defined over a continuous input space. This phenomenon occurs in most state-of-the-art Large Language Models (LLMs), including Llama2, Llama3, Phi3, Gemma, Gemma2, and Mistral, and suggests that LLMs reason about language in ways that fundamentally differ from humans. Our work formally extends Transformers to capture the nuances of time and space continuity in both input and output space. Our results challenge the traditional interpretation of how LLMs understand language, with several linguistic and engineering implications.
IV · Apr 25, 2024
Space-Variant Total Variation boosted by learning techniques in few-view tomographic imaging
Elena Morotti, Davide Evangelista, Andrea Sebastiani et al.
This paper focuses on the development of a space-variant regularization model for solving an under-determined linear inverse problem. The case study is medical image reconstruction from few-view, noisy tomographic data. The primary objective of the proposed optimization model is to achieve a good balance between denoising and the preservation of fine details and edges, surpassing the performance of the popular and widely used Total Variation (TV) regularization through the application of appropriate pixel-dependent weights. The proposed strategy leverages gradient approximations for the computation of the space-variant TV weights. For this reason, a convolutional neural network is designed to approximate both the ground truth image and its gradient, using an elastic loss function in its training. Additionally, the paper provides a theoretical analysis of the proposed model, showing the uniqueness of its solution, and illustrates a Chambolle-Pock algorithm tailored to address the specific problem at hand. This comprehensive framework integrates innovative regularization techniques with advanced neural network capabilities, demonstrating promising results in achieving high-quality reconstructions from low-sampled tomographic data.
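The notion of "pixel-dependent weights" in a TV regularizer can be sketched as follows. The function evaluates a weighted isotropic TV term, the sum over pixels of the weight times the gradient magnitude, on a small image stored as nested lists. The forward-difference discretization, the zero boundary handling, and the name `weighted_tv` are assumptions for illustration. How the paper derives the weights from the network's gradient estimate is not reproduced here.

```python
import math
from typing import List

Image = List[List[float]]


def weighted_tv(image: Image, weights: Image) -> float:
    """Weighted isotropic total variation with forward differences.

    The difference across the last row/column is taken as zero.
    """
    rows, cols = len(image), len(image[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            # Horizontal and vertical forward differences at pixel (i, j).
            dh = image[i][j + 1] - image[i][j] if j + 1 < cols else 0.0
            dv = image[i + 1][j] - image[i][j] if i + 1 < rows else 0.0
            total += weights[i][j] * math.hypot(dh, dv)
    return total


step = [[0.0, 1.0],
        [0.0, 1.0]]                  # a vertical edge between the two columns
uniform = [[1.0, 1.0], [1.0, 1.0]]   # classical TV: every pixel weighted equally
relaxed = [[0.1, 1.0], [0.1, 1.0]]   # small weights where the edge sits

tv_uniform = weighted_tv(step, uniform)
tv_relaxed = weighted_tv(step, relaxed)
```

With uniform weights the edge is penalized in full, whereas small weights at edge pixels reduce the penalty there. This is the mechanism by which a space-variant model can denoise flat regions without smoothing away edges.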
AP · Jan 21, 2025
Controlling Ensemble Variance in Diffusion Models: An Application for Reanalyses Downscaling
Fabio Merizzi, Davide Evangelista, Harilaos Loukos
In recent years, diffusion models have emerged as powerful tools for generating ensemble members in meteorology. In this work, we demonstrate that a Denoising Diffusion Implicit Model (DDIM) can effectively control ensemble variance by varying the number of diffusion steps. Introducing a theoretical framework, we relate diffusion steps to the variance expressed by the reverse diffusion process. Focusing on reanalysis downscaling, we propose an ensemble diffusion model for the full ERA5-to-CERRA domain, generating variance-calibrated ensemble members for wind speed at full spatial and temporal resolution. Our method aligns global mean variance with a reference ensemble dataset and ensures spatial variance is distributed in accordance with observed meteorological variability. Additionally, we address the lack of ensemble information in the CARRA dataset, showcasing the utility of our approach for efficient, high-resolution ensemble generation.
NA · Dec 2, 2024
Deep Guess acceleration for explainable image reconstruction in sparse-view CT
Elena Loli Piccolomini, Davide Evangelista, Elena Morotti
Sparse-view Computed Tomography (CT) is an emerging protocol designed to reduce the X-ray radiation dose in medical imaging. Reconstructions from the traditional Filtered Back Projection algorithm suffer from severe artifacts due to sparse data. In contrast, Model-Based Iterative Reconstruction (MBIR) algorithms, though better at mitigating noise through regularization, are too computationally costly for clinical use. This paper introduces a novel technique, denoted as the Deep Guess acceleration scheme, which uses a trained neural network both to speed up the regularized MBIR and to enhance the reconstruction accuracy. We integrate state-of-the-art deep learning tools to compute a clever starting guess for a proximal algorithm solving a non-convex model, thus obtaining an interpretable solution image in a few iterations. Experimental results on real CT images demonstrate the effectiveness of Deep Guess in (very) sparse tomographic protocols, where it outperforms its purely variational counterpart and many state-of-the-art data-driven approaches. We also consider a ground truth-free implementation and test the robustness of the proposed framework to noise.
LG · Oct 24, 2025
On the flow matching interpretability
Francesco Pivi, Simone Gazza, Davide Evangelista et al.
Generative models based on flow matching have demonstrated remarkable success in various domains, yet they suffer from a fundamental limitation: the lack of interpretability in their intermediate generation steps. In fact, these models learn to transform noise into data through a series of vector field updates; however, the meaning of each step remains opaque. We address this problem by proposing a general framework constraining each flow step to be sampled from a known physical distribution. Flow trajectories are mapped to (and constrained to traverse) the equilibrium states of the simulated physical process. We implement this approach through the 2D Ising model in such a way that flow steps become thermal equilibrium points along a parametric cooling schedule. Our proposed architecture includes an encoder that maps discrete Ising configurations into a continuous latent space, a flow-matching network that performs temperature-driven diffusion, and a projector that returns to discrete Ising states while preserving physical constraints. We validate this framework across multiple lattice sizes, showing that it preserves physical fidelity while outperforming Monte Carlo generation in speed as the lattice size increases. In contrast with standard flow matching, each vector field represents a meaningful stepwise transition in the 2D Ising model's latent space. This demonstrates that embedding physical semantics into generative flows transforms opaque neural trajectories into interpretable physical processes.
CV · May 31, 2023
Ambiguity in solving imaging inverse problems with deep learning based operators
Davide Evangelista, Elena Morotti, Elena Loli Piccolomini et al.
In recent years, large convolutional neural networks have been widely used as tools for image deblurring because of their ability to restore images very precisely. It is well known that image deblurring is mathematically modeled as an ill-posed inverse problem and that its solution is difficult to approximate when noise affects the data. Indeed, one limitation of neural networks for deblurring is their sensitivity to noise and other perturbations, which can lead to instability and produce poor reconstructions. In addition, networks do not necessarily take into account the numerical formulation of the underlying imaging problem when trained end-to-end. In this paper, we propose some strategies to improve stability without losing too much accuracy when deblurring images with deep learning-based methods. First, we suggest a very small neural architecture, which reduces the execution time for training, satisfying a green AI need, and does not excessively amplify noise in the computed image. Second, we introduce a unified framework where a pre-processing step compensates for the lack of stability of the subsequent neural network-based step. Two different pre-processors are presented: the former implements a strong parameter-free denoiser, and the latter is a variational model-based regularized formulation of the latent imaging problem. This framework is also formally characterized by mathematical analysis. Numerical experiments are performed to verify the accuracy and stability of the proposed approaches for image deblurring when unknown or unquantified noise is present; the results confirm that they improve the network stability with respect to noise. In particular, the model-based framework represents the most reliable trade-off between visual precision and robustness.
LG · Jul 26, 2021
Dissecting FLOPs along input dimensions for GreenAI cost estimations
Andrea Asperti, Davide Evangelista, Moreno Marzolla
The term GreenAI refers to a novel approach to Deep Learning that is more aware of the ecological impact and the computational efficiency of its methods. The promoters of GreenAI suggested the use of Floating Point Operations (FLOPs) as a measure of the computational cost of Neural Networks; however, that measure does not correlate well with the energy consumption of hardware equipped with massively parallel processing units like GPUs or TPUs. In this article, we propose a simple refinement of the formula used to compute floating point operations for convolutional layers, called α-FLOPs, which explains and corrects the traditional discrepancy across different layers and is closer to reality. The notion of α-FLOPs relies on the crucial insight that, in the case of inputs with multiple dimensions, there is no reason to believe that the speedup offered by parallelism will be uniform along all different axes.
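For context, the conventional count that α-FLOPs refines can be sketched as follows. The function computes the usual multiply-accumulate count of a 2D convolution. Its name, the argument names, and the convention of counting one multiply-accumulate as one operation (some sources count two) are assumptions for illustration. The α-dependent correction proposed in the paper is deliberately not reproduced here.

```python
def conv2d_flops(h_out: int, w_out: int, c_in: int, c_out: int,
                 k_h: int, k_w: int) -> int:
    """Conventional cost of a 2D convolution, in multiply-accumulates.

    Each of the h_out * w_out * c_out output values needs one
    multiply-accumulate per kernel element per input channel.
    Bias terms are ignored.
    """
    return h_out * w_out * c_out * (k_h * k_w * c_in)


# Doubling the output width or the number of input channels doubles the
# count, so this formula treats all input axes as equally expensive.
base = conv2d_flops(32, 32, c_in=16, c_out=32, k_h=3, k_w=3)
wider = conv2d_flops(32, 64, c_in=16, c_out=32, k_h=3, k_w=3)
deeper = conv2d_flops(32, 32, c_in=32, c_out=32, k_h=3, k_w=3)
```

Under this count, `wider` and `deeper` cost exactly the same. That uniformity across axes is the assumption the abstract questions for massively parallel hardware.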