LG, Nov 25, 2022
Copula Density Neural Estimation
Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello
Probability density estimation from observed data constitutes a central task in statistics. In this brief, we focus on the problem of estimating the copula density associated with any observed data, as it fully describes the dependence between random variables. We separate the univariate marginal distributions from the joint dependence structure in the data, the copula itself, and we model the latter with a neural network-based method referred to as copula density neural estimation (CODINE). Results show that the novel learning approach is capable of modeling complex distributions and can be applied to mutual information estimation and data generation.
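As a rough illustration of separating marginals from dependence, the sketch below maps each variable to the unit interval with scaled ranks (its empirical CDF), producing so-called pseudo-observations. This is a common preprocessing step in copula modeling and is only an assumption here; the abstract does not state how CODINE handles the marginals, and the function name `pseudo_observations` is hypothetical.

```python
def pseudo_observations(column):
    """Map one variable's samples to (0, 1) using scaled ranks (empirical CDF)."""
    n = len(column)
    # Indices of the samples sorted by value (ties keep their input order).
    order = sorted(range(n), key=lambda i: column[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Dividing by n + 1 keeps every value strictly inside (0, 1).
        u[i] = rank / (n + 1)
    return u

# Two variables with very different scales but the same ordering of samples.
x1 = [2.5, 0.1, 7.0, 3.3]
x2 = [10.0, -4.0, 55.0, 12.0]
u1 = pseudo_observations(x1)
u2 = pseudo_observations(x2)
```

Because the transform depends only on ranks, `u1` and `u2` coincide even though `x1` and `x2` have different marginals: what remains after the mapping is the dependence structure alone.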
LG, May 11
Empty SPACE: Cross-Attention Sparsity for Concept Erasure in Diffusion Models
Nicola Novello, Andrea M. Tonello
Erasing specific concepts from text-to-image diffusion models is essential for avoiding the generation of copyrighted and explicit content. Closed-form concept erasure methods offer a fast alternative to backpropagation-based techniques, but they become less effective when scaling from smaller models such as Stable Diffusion 1.5 to larger models like Stable Diffusion XL. To maintain erasure effectiveness in these larger-scale architectures, we propose SParse cross-Attention-based Concept Erasure (SPACE). SPACE iteratively modifies the cross-attention parameters of a model with a closed-form update that jointly induces sparsity and erases target concepts. By concentrating the concept mapping in a lower-dimensional subspace, SPACE achieves superior erasure efficacy compared to dense baselines. Extensive experimental results show improvements in erasure effectiveness and robustness against adversarial prompts. Furthermore, SPACE achieves 80%-90% cross-attention sparsity, reducing the storage requirements for saving the modified parameters by 70%, demonstrating its memory efficiency.
LG, Jan 2, 2024
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy
Nicola Novello, Andrea M. Tonello
In deep learning, classification tasks are formalized as optimization problems often solved via the minimization of the cross-entropy. However, recent advancements in the design of objective functions allow the usage of the $f$-divergence to generalize the formulation of the optimization problem for classification. We adopt a Bayesian perspective and formulate the classification task as a maximum a posteriori probability problem. We propose a class of objective functions based on the variational representation of the $f$-divergence. Furthermore, driven by the challenge of improving the state-of-the-art approach, we propose a bottom-up method that leads us to the formulation of an objective function corresponding to a novel $f$-divergence referred to as shifted log (SL). We theoretically analyze the objective functions proposed and numerically test them in three application scenarios: toy examples, image datasets, and signal detection/decoding problems. The analyzed scenarios demonstrate the effectiveness of the proposed approach and that the SL divergence achieves the highest classification accuracy in almost all the considered cases.
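For reference, the standard definition of the $f$-divergence and its variational (Fenchel conjugate) representation can be written as follows. This is the general textbook form, with $f$ convex and $f(1)=0$, $f^*$ its convex conjugate, and $T$ a function typically parameterized by a neural network; the specific objective functions proposed in the paper may be arranged differently.

```latex
D_f(P \,\|\, Q) = \mathbb{E}_{x \sim Q}\!\left[ f\!\left( \frac{p(x)}{q(x)} \right) \right],
\qquad
D_f(P \,\|\, Q) \;\geq\; \sup_{T} \Big\{ \mathbb{E}_{x \sim P}\big[ T(x) \big] - \mathbb{E}_{x \sim Q}\big[ f^{*}\big(T(x)\big) \big] \Big\},
\qquad
f^{*}(t) = \sup_{u} \big\{ u t - f(u) \big\}.
```

The bound is tight when $T(x) = f'\!\left(p(x)/q(x)\right)$, so a maximizing $T$ carries an estimate of the density ratio.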
LG, Apr 9, 2025
Robust Classification with Noisy Labels Based on Posterior Maximization
Nicola Novello, Andrea M. Tonello
Designing objective functions robust to label noise is crucial for real-world classification algorithms. In this paper, we investigate the robustness to label noise of an $f$-divergence-based class of objective functions recently proposed for supervised classification, herein referred to as $f$-PML. We show that, in the presence of label noise, any of the $f$-PML objective functions can be corrected to obtain a neural network that is equal to the one learned with the clean dataset. Additionally, we propose an alternative and novel correction approach that, during the test phase, refines the posterior estimated by the neural network trained in the presence of label noise. Then, we demonstrate that, even if the considered $f$-PML objective functions are not symmetric, they are robust to symmetric label noise for any choice of $f$-divergence, without the need for any correction approach. This allows us to prove that the cross-entropy, which belongs to the $f$-PML class, is robust to symmetric label noise. Finally, we show that such a class of objective functions can be used together with refined training strategies, achieving competitive performance against state-of-the-art techniques of classification with label noise.
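To make "symmetric label noise" concrete, the sketch below corrupts a list of labels under one common convention: with probability `eta`, a label is replaced by a different class chosen uniformly at random. The function name `add_symmetric_noise` and the exact convention are assumptions for illustration; the paper may define the noise model slightly differently (for example, flipping uniformly over all classes including the true one).

```python
import random

def add_symmetric_noise(labels, num_classes, eta, rng):
    """Flip each label with probability eta to a uniformly chosen other class."""
    noisy = []
    for y in labels:
        if rng.random() < eta:
            # Every wrong class is equally likely, which makes the noise symmetric.
            others = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(others))
        else:
            noisy.append(y)
    return noisy

rng = random.Random(0)
clean = [0, 1, 2, 3] * 250  # 1000 labels over 4 classes
noisy = add_symmetric_noise(clean, num_classes=4, eta=0.3, rng=rng)
flip_rate = sum(c != n for c, n in zip(clean, noisy)) / len(clean)
```

With enough samples, `flip_rate` should land close to `eta`, since every flip is guaranteed to change the label.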
LG, Sep 25, 2025
A Unified Framework for Diffusion Model Unlearning with $f$-Divergence
Nicola Novello, Federico Fontana, Luigi Cinque et al.
Machine unlearning aims to remove specific knowledge from a trained model. While diffusion models (DMs) have shown remarkable generative capabilities, existing unlearning methods for text-to-image (T2I) models often rely on minimizing the mean squared error (MSE) between the output distributions of a target concept and an anchor concept. We show that this MSE-based approach is a special case of a unified $f$-divergence-based framework, in which any $f$-divergence can be used. We analyze the benefits of using different $f$-divergences, which mainly affect the convergence properties of the algorithm and the quality of unlearning. The proposed unified framework offers a flexible paradigm that allows selecting the optimal divergence for a specific application, balancing different trade-offs between aggressive unlearning and concept preservation.
LG, May 31, 2023
Mutual Information Estimation via $f$-Divergence and Data Derangements
Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello
Estimating mutual information accurately is pivotal across diverse applications, from machine learning to communications and biology, enabling us to gain insights into the inner mechanisms of complex systems. Yet, dealing with high-dimensional data presents a formidable challenge, due to its size and the presence of intricate relationships. Recently proposed neural methods employing variational lower bounds on the mutual information have gained prominence. However, these approaches suffer from either high bias or high variance, as the sample size and the structure of the loss function directly influence the training process. In this paper, we propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence. We investigate the impact of the permutation function used to obtain the marginal training samples and present a novel architectural solution based on derangements. The proposed estimator is flexible since it exhibits an excellent bias/variance trade-off. The comparison with state-of-the-art neural estimators, through extensive experimentation within established reference scenarios, shows that our approach offers higher accuracy and lower complexity.
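A derangement is a permutation with no fixed points, so no sample is ever paired with its own partner when building marginal samples from a batch of joint pairs. The sketch below draws one by rejection sampling and uses it to re-pair a batch; it is a minimal illustration under that definition, not the authors' implementation, and the names `random_derangement` and `marginal_pairs` are hypothetical.

```python
import random

def random_derangement(n, rng):
    """Sample a permutation of range(n) with no fixed points (requires n >= 2)."""
    if n < 2:
        raise ValueError("a derangement needs at least two elements")
    while True:
        perm = list(range(n))
        rng.shuffle(perm)
        # Accept only if no index maps to itself; roughly 1 in e draws succeed.
        if all(i != p for i, p in enumerate(perm)):
            return perm

def marginal_pairs(xs, ys, rng):
    """Pair each x with a y from a different joint sample."""
    perm = random_derangement(len(xs), rng)
    return [(xs[i], ys[perm[i]]) for i in range(len(xs))]

rng = random.Random(0)
xs = [0, 1, 2, 3, 4, 5]
ys = ["a", "b", "c", "d", "e", "f"]  # ys[i] is the joint partner of xs[i]
pairs = marginal_pairs(xs, ys, rng)
```

A plain random permutation would leave some joint pairs intact (one fixed point per batch on average), which mixes joint samples into the set meant to represent the product of marginals; the derangement rules this out by construction.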