Andrea M. Tonello

LG
h-index: 5
12 papers
135 citations
Novelty: 50%
AI Score: 46

12 Papers

LG · Nov 25, 2022
Copula Density Neural Estimation

Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello

Probability density estimation from observed data constitutes a central task in statistics. In this brief, we focus on the problem of estimating the copula density associated with any observed data, as it fully describes the dependence between random variables. We separate the univariate marginal distributions from the joint dependence structure in the data, the copula itself, and we model the latter with a neural network-based method referred to as copula density neural estimation (CODINE). Results show that the novel learning approach is capable of modeling complex distributions and can be applied for mutual information estimation and data generation.
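As a rough illustration of the separation step described above (not the CODINE method itself), the marginals can be stripped from a sample by replacing each variable with its scaled rank. The resulting pseudo-observations are approximately uniform on (0, 1) and retain only the dependence structure. The function name `pseudo_observations` and the `n + 1` scaling are assumptions made for this sketch; the abstract does not say how the authors perform this step.

```python
import numpy as np

def pseudo_observations(x):
    """Map each column of x (n samples, d variables) to (0, 1) via scaled ranks.

    Hypothetical helper for illustration; ties are broken by sample order.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # argsort of argsort yields the 0-based rank of every entry within its column
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
    # dividing by n + 1 keeps every value strictly inside (0, 1)
    return ranks / (n + 1.0)

# Toy data: two variables on very different scales
samples = np.array([[3.0, 10.0],
                    [1.0, 30.0],
                    [2.0, 20.0]])
u = pseudo_observations(samples)
```

A copula density model would then be fitted on `u` rather than on the raw samples, so that the marginal scales play no role.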

IT · May 14, 2022
MIND: Maximum Mutual Information Based Neural Decoder

Andrea M. Tonello, Nunzio A. Letizia

There is growing interest in the development of learning architectures with application to digital communication systems. Herein, we consider the detection/decoding problem. We aim at developing an optimal neural architecture for such a task. The definition of the optimal criterion is a fundamental step. We propose to use the mutual information (MI) of the channel input-output signal pair, which leads to the minimization of the a-posteriori information of the transmitted codeword given the communication channel output observation. The computation of the a-posteriori information is a formidable task, and for the majority of channels it is unknown. Therefore, it has to be learned. For such an objective, we propose a novel neural estimator based on a discriminative formulation. This leads to the derivation of the mutual information neural decoder (MIND). The developed neural architecture is capable not only of solving the decoding problem in unknown channels, but also of returning an estimate of the average MI achieved with the coding scheme, as well as the decoding error probability. Several numerical results are reported and compared with maximum a-posteriori and maximum likelihood decoding strategies.

36.4 · LG · May 11
Empty SPACE: Cross-Attention Sparsity for Concept Erasure in Diffusion Models

Nicola Novello, Andrea M. Tonello

Erasing specific concepts from text-to-image diffusion models is essential for avoiding the generation of copyrighted and explicit content. Closed-form concept erasure methods offer a fast alternative to backpropagation-based techniques, but they become less effective when scaling from smaller models such as Stable Diffusion 1.5 to larger models like Stable Diffusion XL. To maintain erasure effectiveness in these larger-scale architectures, we propose SParse cross-Attention-based Concept Erasure (SPACE). SPACE iteratively modifies the cross-attention parameters of a model with a closed-form update that jointly induces sparsity and erases target concepts. By concentrating the concept mapping to a lower-dimensional subspace, SPACE achieves superior erasure efficacy compared to dense baselines. Extensive experimental results show improvements in erasure effectiveness and robustness against adversarial prompts. Furthermore, SPACE achieves 80%-90% cross-attention sparsity, reducing the storage requirements for saving the modified parameters by 70%, demonstrating its memory efficiency.

LG · Jan 2, 2024
$f$-Divergence Based Classification: Beyond the Use of Cross-Entropy

Nicola Novello, Andrea M. Tonello

In deep learning, classification tasks are formalized as optimization problems often solved via the minimization of the cross-entropy. However, recent advancements in the design of objective functions allow the usage of the $f$-divergence to generalize the formulation of the optimization problem for classification. We adopt a Bayesian perspective and formulate the classification task as a maximum a posteriori probability problem. We propose a class of objective functions based on the variational representation of the $f$-divergence. Furthermore, driven by the challenge of improving the state-of-the-art approach, we propose a bottom-up method that leads us to the formulation of an objective function corresponding to a novel $f$-divergence referred to as shifted log (SL). We theoretically analyze the objective functions proposed and numerically test them in three application scenarios: toy examples, image datasets, and signal detection/decoding problems. The analyzed scenarios demonstrate the effectiveness of the proposed approach and that the SL divergence achieves the highest classification accuracy in almost all the considered cases.
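For readers unfamiliar with the variational representation mentioned above, the general, well-known form is $D_f(P\|Q) \geq \mathbb{E}_P[T] - \mathbb{E}_Q[f^*(T)]$, where $f^*$ is the convex conjugate of $f$ and $T$ is any test function. The sketch below checks this numerically for the KL divergence on a small discrete example. It is not the classification objective proposed in the paper, nor the SL divergence; the distributions `p` and `q` and the function names are assumptions chosen purely for illustration.

```python
import numpy as np

def kl_divergence(p, q):
    """Exact KL(P || Q) for discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))

def variational_bound(p, q, t):
    """E_P[T] - E_Q[f*(T)] with f*(t) = exp(t - 1), the conjugate of f(u) = u log u."""
    return float(np.sum(p * t) - np.sum(q * np.exp(t - 1.0)))

# Hypothetical toy distributions over three outcomes
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.2, 0.3, 0.5])

exact = kl_divergence(p, q)
# The bound is tight at T*(x) = 1 + log(p(x) / q(x))
tight = variational_bound(p, q, 1.0 + np.log(p / q))
# Any other test function, here T = 0, gives a smaller value
loose = variational_bound(p, q, np.zeros(3))
```

In neural estimators, $T$ is parameterized by a network and the bound is maximized by gradient ascent instead of being evaluated in closed form.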

LG · Apr 9, 2025
Robust Classification with Noisy Labels Based on Posterior Maximization

Nicola Novello, Andrea M. Tonello

Designing objective functions robust to label noise is crucial for real-world classification algorithms. In this paper, we investigate the robustness to label noise of an $f$-divergence-based class of objective functions recently proposed for supervised classification, herein referred to as $f$-PML. We show that, in the presence of label noise, any of the $f$-PML objective functions can be corrected to obtain a neural network that is equal to the one learned with the clean dataset. Additionally, we propose an alternative and novel correction approach that, during the test phase, refines the posterior estimated by the neural network trained in the presence of label noise. Then, we demonstrate that, even if the considered $f$-PML objective functions are not symmetric, they are robust to symmetric label noise for any choice of $f$-divergence, without the need for any correction approach. This allows us to prove that the cross-entropy, which belongs to the $f$-PML class, is robust to symmetric label noise. Finally, we show that such a class of objective functions can be used together with refined training strategies, achieving competitive performance against state-of-the-art techniques of classification with label noise.
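The intuition behind robustness to symmetric label noise can be sketched with the standard symmetric noise model, in which a label is flipped with probability $\eta$ uniformly to one of the other $K-1$ classes. Under this assumed model the noisy posterior is an increasing affine function of the clean posterior whenever $\eta < (K-1)/K$, so the most probable class is unchanged. The code below only illustrates that fact; it does not reproduce the paper's proof or its correction methods, and the names `symmetric_noise_matrix`, `eta`, and the example posterior are hypothetical.

```python
import numpy as np

def symmetric_noise_matrix(num_classes, eta):
    """Transition matrix T with T[i, j] = P(noisy label = j | clean label = i)."""
    off_diagonal = eta / (num_classes - 1)
    t = np.full((num_classes, num_classes), off_diagonal)
    np.fill_diagonal(t, 1.0 - eta)
    return t

# Hypothetical clean class posterior for one input, K = 3 classes
clean_posterior = np.array([0.5, 0.3, 0.2])
T = symmetric_noise_matrix(3, eta=0.4)  # 0.4 < (K - 1) / K = 2/3

# Posterior over noisy labels: p_noisy[j] = sum_i p_clean[i] * T[i, j]
noisy_posterior = clean_posterior @ T

same_decision = np.argmax(noisy_posterior) == np.argmax(clean_posterior)
```

The noisy posterior is flatter than the clean one, but the ranking of the classes, and therefore the maximum a posteriori decision, is preserved.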

LG · Sep 25, 2025
A Unified Framework for Diffusion Model Unlearning with f-Divergence

Nicola Novello, Federico Fontana, Luigi Cinque et al.

Machine unlearning aims to remove specific knowledge from a trained model. While diffusion models (DMs) have shown remarkable generative capabilities, existing unlearning methods for text-to-image (T2I) models often rely on minimizing the mean squared error (MSE) between the output distribution of a target and an anchor concept. We show that this MSE-based approach is a special case of a unified $f$-divergence-based framework, in which any $f$-divergence can be utilized. We analyze the benefits of using different $f$-divergences, which mainly impact the convergence properties of the algorithm and the quality of unlearning. The proposed unified framework offers a flexible paradigm that allows selecting the optimal divergence for a specific application, balancing different trade-offs between aggressive unlearning and concept preservation.

LG · May 31, 2023
Mutual Information Estimation via $f$-Divergence and Data Derangements

Nunzio A. Letizia, Nicola Novello, Andrea M. Tonello

Estimating mutual information accurately is pivotal across diverse applications, from machine learning to communications and biology, enabling us to gain insights into the inner mechanisms of complex systems. Yet, dealing with high-dimensional data presents a formidable challenge, due to its size and the presence of intricate relationships. Recently proposed neural methods employing variational lower bounds on the mutual information have gained prominence. However, these approaches suffer from either high bias or high variance, as the sample size and the structure of the loss function directly influence the training process. In this paper, we propose a novel class of discriminative mutual information estimators based on the variational representation of the $f$-divergence. We investigate the impact of the permutation function used to obtain the marginal training samples and present a novel architectural solution based on derangements. The proposed estimator is flexible since it exhibits an excellent bias/variance trade-off. The comparison with state-of-the-art neural estimators, through extensive experimentation within established reference scenarios, shows that our approach offers higher accuracy and lower complexity.
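A derangement is a permutation with no fixed points. When marginal samples are built by shuffling the $y$ half of a batch of pairs $(x_i, y_i)$, a plain random permutation may leave some pairs intact, so that true joint samples are presented as marginal ones; a derangement rules this out. The sketch below draws a derangement by rejection sampling, which is an assumption of this example: the abstract does not specify how the authors generate it, and `random_derangement` is a hypothetical helper.

```python
import numpy as np

def random_derangement(n, rng):
    """Return a permutation of range(n) with no fixed points, for n >= 2."""
    if n < 2:
        raise ValueError("a derangement needs at least two elements")
    identity = np.arange(n)
    while True:
        perm = rng.permutation(n)
        # Accept only if no index is mapped to itself
        if not np.any(perm == identity):
            return perm

rng = np.random.default_rng(0)
# Toy batch of dependent pairs (x_i, y_i)
x = rng.normal(size=(8, 2))
y = x + 0.1 * rng.normal(size=(8, 2))

sigma = random_derangement(len(y), rng)
# Every x_i is now paired with some y_j, j != i
y_marginal = y[sigma]
```

The pairs `(x, y)` then play the role of joint samples and `(x, y_marginal)` the role of samples from the product of the marginals.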

IT · Jul 7, 2021
Discriminative Mutual Information Estimators for Channel Capacity Learning

Nunzio A. Letizia, Andrea M. Tonello

Channel capacity plays a crucial role in the development of modern communication systems as it represents the maximum rate at which information can be reliably transmitted over a communication channel. Nevertheless, for the majority of channels, finding a closed-form capacity expression remains an open challenge. This is because it requires carrying out two formidable tasks: a) the computation of the mutual information between the channel input and output, and b) its maximization with respect to the signal distribution at the channel input. In this paper, we address both tasks. Inspired by implicit generative models, we propose a novel cooperative framework to automatically learn the channel capacity, for any type of memoryless channel. In particular, we first develop a new methodology to estimate the mutual information directly from a discriminator typically deployed to train adversarial networks, referred to as discriminative mutual information estimator (DIME). Secondly, we include the discriminator in a cooperative channel capacity learning framework, referred to as CORTICAL, where a discriminator learns to distinguish between dependent and independent channel input-output samples while a generator learns to produce the optimal channel input distribution for which the discriminator exhibits the best performance. Lastly, we prove that a particular choice of the cooperative value function solves the channel capacity estimation problem. Simulation results demonstrate that the proposed method offers high accuracy.
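To see why a discriminator can yield a mutual information estimate, consider the textbook case of a binary discriminator trained with the usual cross-entropy objective on balanced joint and product-of-marginals samples. Its optimum is $D^*(x,y) = p_{XY}/(p_{XY} + p_X p_Y)$, so $D^*/(1-D^*)$ recovers the density ratio whose expected logarithm is the mutual information. The sketch below verifies this on a binary symmetric channel with uniform input. It assumes this standard objective; the value function actually used by DIME may differ, and all variable names are illustrative.

```python
import numpy as np

# Joint pmf of input X (rows) and output Y (columns):
# uniform input over a binary symmetric channel with crossover probability 0.2
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])
px = joint.sum(axis=1, keepdims=True)
py = joint.sum(axis=0, keepdims=True)
product = px * py  # product of the marginals

# Optimal discriminator for the assumed balanced cross-entropy objective
d_star = joint / (joint + product)

# D / (1 - D) recovers the density ratio p_XY / (p_X p_Y)
density_ratio = d_star / (1.0 - d_star)
mi_from_discriminator = float(np.sum(joint * np.log(density_ratio)))

# Reference value computed directly from the definition (in nats)
mi_direct = float(np.sum(joint * np.log(joint / product)))
```

In practice the discriminator is a trained network and the expectation is replaced by an average over joint samples, so the estimate is only as good as the learned discriminator.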

IT · Sep 11, 2020
Capacity-Approaching Autoencoders for Communications

Nunzio A. Letizia, Andrea M. Tonello

The autoencoder concept has fostered the reinterpretation and the design of modern communication systems. It consists of an encoder, a channel, and a decoder block which modify their internal neural structure in an end-to-end learning fashion. However, the current approach to training an autoencoder relies on the use of the cross-entropy loss function. This approach can be prone to overfitting issues and often fails to learn an optimal system and signal representation (code). In addition, less is known about the autoencoder's ability to design channel capacity-approaching codes, i.e., codes that maximize the input-output information under a certain power constraint. The task is even more formidable for an unknown channel, for which the capacity is unknown and therefore has to be learnt. In this paper, we address the challenge of designing capacity-approaching codes by incorporating the presence of the communication channel into a novel loss function for the autoencoder training. In particular, we exploit the mutual information between the transmitted and received signals as a regularization term in the cross-entropy loss function, with the aim of controlling the amount of information stored. By jointly maximizing the mutual information and minimizing the cross-entropy, we propose a methodology that a) computes an estimate of the channel capacity and b) constructs an optimal coded signal approaching it. Several simulation results offer evidence of the potential of the proposed method.

SP · Apr 24, 2019
Machine Learning Tips and Tricks for Power Line Communications

Andrea M. Tonello, Nunzio A. Letizia, Davide Righini et al.

A great deal of attention has been recently given to Machine Learning (ML) techniques in many different application fields. This paper provides a vision of what ML can do in Power Line Communications (PLC). We first briefly describe classical formulations of ML, and distinguish deterministic from statistical learning models with relevance to communications. We then discuss ML applications in PLC for each layer, namely, for characterization and modeling, for the development of physical layer algorithms, for media access control and networking. Finally, other applications of PLC that can benefit from the usage of ML, such as grid diagnostics, are analyzed. Illustrative numerical examples are reported to validate the ideas and motivate future research endeavors in this stimulating signal/data processing field.

CR · Sep 25, 2018
Physical Layer Key Generation for Secure Power Line Communications

Federico Passerini, Andrea M. Tonello

Leakage of information in power line communication networks is a threat to privacy and security both in smart grids and in-home applications. A way to enhance security is to encode the transmitted information with a secret key. Relying on the channel properties, it is possible to generate a common key at the two communication ends without transmitting it through the broadcast channel. Since the key is generated locally, it is intrinsically secure from a possible eavesdropper. Most of the existing physical layer key generation techniques have been developed for symmetric channels. However, the power line channel is in general not symmetric, but just reciprocal. Therefore, in this paper, we propose two novel methods that exploit the reciprocity of the power line channel to generate common information at the two intended users. This information is processed through different quantization techniques to generate secret keys. To assess the security of the generated keys, we analyze the spatial correlation of the power line channels and verify the low correlation of the possible eavesdropping channels. The two proposed methods are tested on a measurement dataset. The results show that the information leaked to possible eavesdroppers has very low correlation to any secret key.

HC · May 6, 2015
An Open Solution to Provide Personalized Feedback for Building Energy Management

Andrea Monacchi, Fabio Versolatto, Manuel Herold et al.

The integration of renewable energy sources increases the complexity of maintaining the power grid. In particular, the highly dynamic nature of generation and consumption demands better utilization of energy resources, which, given the cost of storage infrastructure, can only be achieved through demand-response. Accordingly, the availability of energy and potential overload situations can be reflected using a price signal. The effectiveness of this mechanism arises from the flexibility of device operation, which is nevertheless heavily reliant on the exchange of information between the grid and its consumers. In this paper, we investigate the capability of an interactive energy management system to inform users about energy usage in a timely manner, in order to promote an optimal use of local resources. In particular, we analyze data being collected in several households in Italy and Austria to gain insights into usage behavior and drive the design of more effective systems. The outcome is the formulation of energy efficiency policies for residential buildings, as well as the design of an energy management system, consisting of hardware measurement units and management software. The Mjölnir framework, which we release for open use, provides a platform where various feedback concepts can be implemented and assessed. This includes widgets displaying disaggregated and aggregated consumption information, as well as daily production and tailored advice. The formulated policies were implemented as an advisor widget able to autonomously analyze usage and provide tailored energy feedback.