Luigi Malagò

8 papers

38 citations

Novelty 39%

AI Score 20

Ranked #193,177 of 205,806 authors (top 94%)
#41,039 in LG (top 97%)

8 Papers

SD · Feb 24, 2021

Automatic Feature Extraction for Heartbeat Anomaly Detection

Robert-George Colt, Csongor-Huba Várady, Riccardo Volpi et al.

We focus on automatic feature extraction for raw audio heartbeat sounds, aimed at anomaly detection applications in healthcare. We learn features with the help of an autoencoder composed of a 1D non-causal convolutional encoder and a WaveNet decoder trained with a modified objective based on variational inference, employing the Maximum Mean Discrepancy (MMD). Moreover, we model the latent distribution using a Gaussian chain graphical model to capture the temporal correlations which characterize the encoded signals. After training the autoencoder on the reconstruction task in an unsupervised manner, we test the significance of the learned latent representations by training an SVM to predict anomalies. We evaluate the methods on a problem proposed by the PASCAL Classifying Heart Sounds Challenge and we compare with results in the literature.
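As a rough illustration of the MMD term mentioned in the abstract, the sketch below estimates the squared MMD between two sets of samples with a Gaussian (RBF) kernel. The kernel choice, the `bandwidth` value, and the names `latent_codes` and `prior_samples` are assumptions made for this example; the abstract does not specify how the authors compute the discrepancy.

```python
import numpy as np

def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between x and y."""
    k_xx = rbf_kernel(x, x, bandwidth)
    k_yy = rbf_kernel(y, y, bandwidth)
    k_xy = rbf_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()

rng = np.random.default_rng(0)
# Hypothetical stand-ins: encoder outputs with a shifted mean, and prior draws.
latent_codes = rng.normal(loc=0.5, scale=1.0, size=(200, 2))
prior_samples = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

same = mmd_squared(prior_samples, prior_samples)   # identical sets
shifted = mmd_squared(latent_codes, prior_samples)  # mismatched distributions
print(same, shifted)
```

In a variational autoencoder objective, a term like `mmd_squared(latent_codes, prior_samples)` would typically penalize mismatch between the aggregate encoding distribution and the prior; whether the paper uses it exactly this way is not stated in the abstract.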

ST · Sep 20, 2020

Lagrangian and Hamiltonian Mechanics for Probabilities on the Statistical Manifold

Goffredo Chirco, Luigi Malagò, Giovanni Pistone

We provide an Information-Geometric formulation of Classical Mechanics on the Riemannian manifold of probability distributions, which is an affine manifold endowed with a dually-flat connection. In a non-parametric formalism, we consider the full set of positive probability functions on a finite sample space, and we provide a specific expression for the tangent and cotangent spaces over the statistical manifold, in terms of a Hilbert bundle structure that we call the Statistical Bundle. In this setting, we compute velocities and accelerations of a one-dimensional statistical model using the canonical dual pair of parallel transports and define a coherent formalism for Lagrangian and Hamiltonian mechanics on the bundle. Finally, in a series of examples, we show how our formalism provides a consistent framework for accelerated natural gradient dynamics on the probability simplex, paving the way for direct applications in optimization, game theory and neural networks.

LG · Aug 15, 2020

Natural Reweighted Wake-Sleep

Csongor Várady, Riccardo Volpi, Luigi Malagò et al.

Helmholtz Machines (HMs) are a class of generative models composed of two Sigmoid Belief Networks (SBNs), acting respectively as an encoder and a decoder. These models are commonly trained using a two-step optimization algorithm called Wake-Sleep (WS) and more recently by improved versions, such as Reweighted Wake-Sleep (RWS) and Bidirectional Helmholtz Machines (BiHM). The locality of the connections in an SBN induces sparsity in the Fisher Information Matrices associated with the probabilistic models, in the form of a fine-grained block-diagonal structure. In this paper we exploit this property to efficiently train SBNs and HMs using the natural gradient. We present a novel algorithm, called Natural Reweighted Wake-Sleep (NRWS), that corresponds to the geometric adaptation of its standard version. In a similar manner, we also introduce Natural Bidirectional Helmholtz Machine (NBiHM). Unlike previous work, we show that for HMs the natural gradient can be efficiently computed without introducing any approximation in the structure of the Fisher information matrix. The experiments performed on standard datasets from the literature show a consistent improvement of NRWS and NBiHM not only with respect to their non-geometric baselines but also with respect to state-of-the-art training algorithms for HMs. The improvement is quantified both in terms of speed of convergence and in terms of the value of the log-likelihood reached after training.

LG · Jul 24, 2020

Improved Slice-wise Tumour Detection in Brain MRIs by Computing Dissimilarities between Latent Representations

Alexandra-Ioana Albu, Alina Enescu, Luigi Malagò

Anomaly detection for Magnetic Resonance Images (MRIs) can be solved with unsupervised methods by learning the distribution of healthy images and identifying anomalies as outliers. In the presence of an additional dataset of unlabelled data that also contains anomalies, the task can be framed as a semi-supervised task with negative and unlabelled sample points. Recently, in Albu et al., 2020, we have proposed a slice-wise semi-supervised method for tumour detection based on the computation of a dissimilarity function in the latent space of a Variational AutoEncoder, trained on unlabelled data. The dissimilarity is computed between the encoding of the image and the encoding of its reconstruction obtained through a different autoencoder trained only on healthy images. In this paper we present novel and improved results for our method, obtained by training the Variational AutoEncoders on a subset of the HCP and BRATS-2018 datasets and testing on the remaining individuals. We show that by training the models on higher resolution images and by improving the quality of the reconstructions, we obtain results which are comparable with different baselines, which employ a single VAE trained on healthy individuals. As expected, the performance of our method increases with the size of the threshold used to determine the presence of an anomaly.

CO · May 4, 2020

Parameters Estimation from the 21 cm signal using Variational Inference

Héctor J. Hortúa, Riccardo Volpi, Luigi Malagò

Upcoming experiments such as the Hydrogen Epoch of Reionization Array (HERA) and the Square Kilometre Array (SKA) are intended to measure the 21 cm signal over a wide range of redshifts, representing an incredible opportunity to advance our understanding of the nature of cosmic Reionization. At the same time, these kinds of experiments will present new challenges in processing the extensive amount of data generated, calling for the development of automated methods capable of precisely estimating physical parameters and their uncertainties. In this paper we employ Variational Inference, and in particular Bayesian Neural Networks, as an alternative to MCMC in 21 cm observations to report credible estimates for cosmological and astrophysical parameters and assess the correlations among them.

LG · Dec 4, 2019

Natural Alpha Embeddings

Riccardo Volpi, Luigi Malagò

Learning an embedding for a large collection of items is a popular approach to overcome the computational limitations associated with one-hot encodings. The aim of item embedding is to learn a low dimensional space for the representations, able to capture with its geometry relevant features or relationships for the data at hand. This can be achieved for example by exploiting adjacencies among items in large sets of unlabelled data. In this paper we interpret in an Information Geometric framework the item embeddings obtained from conditional models. By exploiting the $α$-geometry of the exponential family, first introduced by Amari, we introduce a family of natural $α$-embeddings represented by vectors in the tangent space of the probability simplex, which includes as a special case standard approaches available in the literature. A typical example is given by word embeddings, commonly used in natural language processing, such as Word2Vec and GloVe. In our analysis, we show how the $α$-deformation parameter can impact standard evaluation tasks.
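For readers unfamiliar with the $α$-geometry referred to above, the sketch below implements Amari's standard $α$-representation of a positive probability vector, which interpolates between the mixture coordinates ($α = -1$) and the logarithmic coordinates ($α = 1$). It only illustrates this coordinate map; the construction of the natural $α$-embeddings themselves is not given in the abstract and is not reproduced here. The function name and the example vector are hypothetical.

```python
import numpy as np

def alpha_representation(p, alpha):
    """Amari's alpha-representation of a strictly positive probability vector.

    alpha = 1 gives log(p); otherwise (2 / (1 - alpha)) * p ** ((1 - alpha) / 2).
    """
    p = np.asarray(p, dtype=float)
    if np.isclose(alpha, 1.0):
        return np.log(p)
    return (2.0 / (1.0 - alpha)) * p ** ((1.0 - alpha) / 2.0)

# Hypothetical conditional distribution over four items.
p = np.array([0.1, 0.2, 0.3, 0.4])

mixture_coords = alpha_representation(p, -1.0)  # equals p itself
sqrt_coords = alpha_representation(p, 0.0)      # equals 2 * sqrt(p)
log_coords = alpha_representation(p, 1.0)       # equals log(p)
print(mixture_coords, sqrt_coords, log_coords)
```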

IM · Nov 19, 2019

Parameters Estimation for the Cosmic Microwave Background with Bayesian Neural Networks

Hector J. Hortua, Riccardo Volpi, Dimitri Marinelli et al.

In this paper, we present the first study that compares different models of Bayesian Neural Networks (BNNs) to predict the posterior distribution of the cosmological parameters directly from the Cosmic Microwave Background temperature and polarization maps. We focus our analysis on four different methods to sample the weights of the network during training: Dropout, DropConnect, Reparameterization Trick (RT), and Flipout. We find that Flipout outperforms all other methods regardless of the architecture used, and provides tighter constraints for the cosmological parameters. Moreover, we compare with MCMC posterior analysis, obtaining comparable error correlation among parameters, with BNNs being orders of magnitude faster in inference, although less accurate. Thanks to the speed of the inference process with BNNs, the posterior distribution, outcome of the neural network, can be used as the initial proposal for the Markov Chain. We show that this combined approach increases the acceptance rate in the Metropolis-Hastings algorithm and accelerates the convergence of the MCMC, while reaching the same final accuracy. In the second part of the paper, we present a guide to the training and calibration of a successful multi-channel BNN for the CMB temperature and polarization map. We show how, by tuning the regularization parameter for the standard deviation of the approximate posterior on the weights in Flipout and RT, we can produce unbiased and reliable uncertainty estimates, i.e., the regularizer acts like a hyperparameter analogous to the dropout rate in Dropout. Finally, we show how polarization, when combined with the temperature in a single multi-channel tensor fed to a single BNN, helps to break degeneracies among parameters and provides stringent constraints.

LG · Jul 5, 2018

Learning in Variational Autoencoders with Kullback-Leibler and Renyi Integral Bounds

Septimia Sârbu, Riccardo Volpi, Alexandra Peşte et al.

In this paper we propose two novel bounds for the log-likelihood based on the Kullback-Leibler and Rényi divergences, which can be used for variational inference and in particular for the training of Variational AutoEncoders. Our proposal is motivated by the difficulties encountered in training VAEs on continuous datasets with high contrast images, such as those with handwritten digits and characters, where numerical issues often appear unless noise is added, either to the dataset during training or to the generative model given by the decoder. The new bounds we propose, which are obtained from the maximization of the likelihood of an interval for the observations, allow numerically stable training procedures without the necessity of adding any extra source of noise to the data.
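To make the idea of an interval likelihood concrete, the sketch below compares the log-density of a Gaussian at a point with the log-probability of a small interval around that point. It assumes a Gaussian decoder and a hypothetical `half_width` of half a pixel bin (1/512); it is not the paper's bounds, only an illustration of why an interval probability stays bounded while a density can grow without limit as the decoder variance shrinks.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def density_log_likelihood(x, mean, std):
    """Log of the Gaussian density at x; unbounded above as std -> 0."""
    return -0.5 * math.log(2.0 * math.pi * std**2) - 0.5 * ((x - mean) / std) ** 2

def interval_log_likelihood(x, mean, std, half_width):
    """Log-probability of [x - half_width, x + half_width]; never above 0."""
    lower = normal_cdf((x - half_width - mean) / std)
    upper = normal_cdf((x + half_width - mean) / std)
    mass = max(upper - lower, 1e-300)  # guard against log(0)
    return math.log(mass)

# A nearly deterministic decoder output centred exactly on the observation.
sharp_density = density_log_likelihood(0.5, 0.5, 1e-6)
sharp_interval = interval_log_likelihood(0.5, 0.5, 1e-6, 1.0 / 512.0)

# A broad decoder output, for comparison.
broad_interval = interval_log_likelihood(0.0, 0.0, 1.0, 0.5)
print(sharp_density, sharp_interval, broad_interval)
```

With the sharp decoder the log-density is large and positive, while the interval log-probability saturates at zero, which is the kind of numerical behaviour the abstract attributes to its interval-based bounds.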