L. Martino

CO
13papers
677citations
Novelty42%
AI Score39

13 Papers

MEAug 17, 2023
Spectral information criterion for automatic elbow detection

L. Martino, R. San Millan-Castillo, E. Morgado

We introduce a generalized information criterion that contains other well-known information criteria, such as Bayesian information Criterion (BIC) and Akaike information criterion (AIC), as special cases. Furthermore, the proposed spectral information criterion (SIC) is also more general than the other information criteria, e.g., since the knowledge of a likelihood function is not strictly required. SIC extracts geometric features of the error curve and, as a consequence, it can be considered an automatic elbow detector. SIC provides a subset of all possible models, with a cardinality that often is much smaller than the total number of possible models. The elements of this subset are elbows of the error curve. A practical rule for selecting a unique model within the sets of elbows is suggested as well. Theoretical invariance properties of SIC are analyzed. Moreover, we test SIC in ideal scenarios where provides always the optimal expected results. We also test SIC in several numerical experiments: some involving synthetic data, and two experiments involving real datasets. They are all real-world applications such as clustering, variable selection, or polynomial order selection, to name a few. The results show the benefits of the proposed scheme. Matlab code related to the experiments is also provided. Possible future research lines are finally discussed.

CEFeb 26
An automatic counting algorithm for the quantification and uncertainty analysis of the number of microglial cells trainable in small and heterogeneous datasets

L. Martino, M. M. Garcia, P. S. Paradas et al.

Counting immunopositive cells on biological tissues generally requires either manual annotation or (when available) automatic rough systems, for scanning signal surface and intensity in whole slide imaging. In this work, we tackle the problem of counting microglial cells in lumbar spinal cord cross-sections of rats by omitting cell detection and focusing only on the counting task. Manual cell counting is, however, a time-consuming task and additionally entails extensive personnel training. The classic automatic color-based methods roughly inform about the total labeled area and intensity (protein quantification) but do not specifically provide information on cell number. Since the images to be analyzed have a high resolution but a huge amount of pixels contain just noise or artifacts, we first perform a pre-processing generating several filtered images {(providing a tailored, efficient feature extraction)}. Then, we design an automatic kernel counter that is a non-parametric and non-linear method. The proposed scheme can be easily trained in small datasets since, in its basic version, it relies only on one hyper-parameter. However, being non-parametric and non-linear, the proposed algorithm is flexible enough to express all the information contained in rich and heterogeneous datasets as well (providing the maximum overfit if required). Furthermore, the proposed kernel counter also provides uncertainty estimation of the given prediction, and can directly tackle the case of receiving several expert opinions over the same image. Different numerical experiments with artificial and real datasets show very promising results. Related Matlab code is also provided.

LGAug 1, 2021
A survey of Monte Carlo methods for noisy and costly densities with application to reinforcement learning and ABC

F. Llorente, L. Martino, J. Read et al.

This survey gives an overview of Monte Carlo methodologies using surrogate models, for dealing with densities which are intractable, costly, and/or noisy. This type of problem can be found in numerous real-world scenarios, including stochastic optimization and reinforcement learning, where each evaluation of a density function may incur some computationally-expensive or even physical (real-world activity) cost, likely to give different results each time. The surrogate model does not incur this cost, but there are important trade-offs and considerations involved in the choice and design of such methodologies. We classify the different methodologies into three main classes and describe specific instances of algorithms under a unified notation. A modular scheme which encompasses the considered methods is also presented. A range of application scenarios is discussed, with special attention to the likelihood-free setting and reinforcement learning. Several numerical comparisons are also provided.

COJul 25, 2021
A Survey of Monte Carlo Methods for Parameter Estimation

D. Luengo, L. Martino, M. Bugallo et al.

Statistical signal processing applications usually require the estimation of some parameters of interest given a set of observed data. These estimates are typically obtained either by solving a multi-variate optimization problem, as in the maximum likelihood (ML) or maximum a posteriori (MAP) estimators, or by performing a multi-dimensional integration, as in the minimum mean squared error (MMSE) estimators. Unfortunately, analytical expressions for these estimators cannot be found in most real-world applications, and the Monte Carlo (MC) methodology is one feasible approach. MC methods proceed by drawing random samples, either from the desired distribution or from a simpler one, and using them to compute consistent estimators. The most important families of MC algorithms are Markov chain MC (MCMC) and importance sampling (IS). On the one hand, MCMC methods draw samples from a proposal density, building then an ergodic Markov chain whose stationary distribution is the desired distribution by accepting or rejecting those candidate samples as the new state of the chain. On the other hand, IS techniques draw samples from a simple proposal density, and then assign them suitable weights that measure their quality in some appropriate way. In this paper, we perform a thorough review of MC methods for the estimation of static parameters in signal processing applications. A historical note on the development of MC schemes is also provided, followed by the basic MC method and a brief description of the rejection sampling (RS) algorithm, as well as three sections describing many of the most relevant MCMC and IS algorithms, and their combined use.

COJul 24, 2021
Automatic tempered posterior distributions for Bayesian inversion problems

L. Martino, F. Llorente, E. Curbelo et al.

We propose a novel adaptive importance sampling scheme for Bayesian inversion problems where the inference of the variables of interest and the power of the data noise is split. More specifically, we consider a Bayesian analysis for the variables of interest (i.e., the parameters of the model to invert), whereas we employ a maximum likelihood approach for the estimation of the noise power. The whole technique is implemented by means of an iterative procedure, alternating sampling and optimization steps. Moreover, the noise power is also used as a tempered parameter for the posterior distribution of the the variables of interest. Therefore, a sequence of tempered posterior densities is generated, where the tempered parameter is automatically selected according to the actual estimation of the noise power. A complete Bayesian study over the model parameters and the scale parameter can be also performed. Numerical experiments show the benefits of the proposed approach.

COMay 6, 2021
MCMC-driven importance samplers

F. Llorente, E. Curbelo, L. Martino et al.

Monte Carlo sampling methods are the standard procedure for approximating complicated integrals of multidimensional posterior distributions in Bayesian inference. In this work, we focus on the class of Layered Adaptive Importance Sampling (LAIS) scheme, which is a family of adaptive importance samplers where Markov chain Monte Carlo algorithms are employed to drive an underlying multiple importance sampling scheme. The modular nature of LAIS allows for different possible implementations, yielding a variety of different performance and computational costs. In this work, we propose different enhancements of the classical LAIS setting in order to increase the efficiency and reduce the computational cost, of both upper and lower layers. The different variants address computational challenges arising in real-world applications, for instance with highly concentrated posterior distributions. Furthermore, we introduce different strategies for designing cheaper schemes, for instance, recycling samples generated in the upper layer and using them in the final estimators in the lower layer. Different numerical experiments, considering several challenging scenarios, show the benefits of the proposed schemes comparing with benchmark methods presented in the literature.

COOct 20, 2020
Deep Importance Sampling based on Regression for Model Inversion and Emulation

F. Llorente, L. Martino, D. Delgado et al.

Understanding systems by forward and inverse modeling is a recurrent topic of research in many domains of science and engineering. In this context, Monte Carlo methods have been widely used as powerful tools for numerical inference and optimization. They require the choice of a suitable proposal density that is crucial for their performance. For this reason, several adaptive importance sampling (AIS) schemes have been proposed in the literature. We here present an AIS framework called Regression-based Adaptive Deep Importance Sampling (RADIS). In RADIS, the key idea is the adaptive construction via regression of a non-parametric proposal density (i.e., an emulator), which mimics the posterior distribution and hence minimizes the mismatch between proposal and target densities. RADIS is based on a deep architecture of two (or more) nested IS schemes, in order to draw samples from the constructed emulator. The algorithm is highly efficient since employs the posterior approximation as proposal density, which can be improved adding more support points. As a consequence, RADIS asymptotically converges to an exact sampler under mild conditions. Additionally, the emulator produced by RADIS can be in turn used as a cheap surrogate model for further studies. We introduce two specific RADIS implementations that use Gaussian Processes (GPs) and Nearest Neighbors (NN) for constructing the emulator. Several numerical experiments and comparisons show the benefits of the proposed schemes. A real-world application in remote sensing model inversion and emulation confirms the validity of the approach.

COMay 31, 2020
Adaptive quadrature schemes for Bayesian inference via active learning

F. Llorente, L. Martino, V. Elvira et al.

Numerical integration and emulation are fundamental topics across scientific fields. We propose novel adaptive quadrature schemes based on an active learning procedure. We consider an interpolative approach for building a surrogate posterior density, combining it with Monte Carlo sampling methods and other quadrature rules. The nodes of the quadrature are sequentially chosen by maximizing a suitable acquisition function, which takes into account the current approximation of the posterior and the positions of the nodes. This maximization does not require additional evaluations of the true posterior. We introduce two specific schemes based on Gaussian and Nearest Neighbors (NN) bases. For the Gaussian case, we also provide a novel procedure for fitting the bandwidth parameter, in order to build a suitable emulator of a density function. With both techniques, we always obtain a positive estimation of the marginal likelihood (a.k.a., Bayesian evidence). An equivalent importance sampling interpretation is also described, which allows the design of extended schemes. Several theoretical results are provided and discussed. Numerical results show the advantage of the proposed approach, including a challenging inference problem in an astronomic dynamical model, with the goal of revealing the number of planets orbiting a star.

COApr 10, 2017
Group Importance Sampling for Particle Filtering and MCMC

L. Martino, V. Elvira, G. Camps-Valls

Bayesian methods and their implementations by means of sophisticated Monte Carlo techniques have become very popular in signal processing over the last years. Importance Sampling (IS) is a well-known Monte Carlo technique that approximates integrals involving a posterior distribution by means of weighted samples. In this work, we study the assignation of a single weighted sample which compresses the information contained in a population of weighted samples. Part of the theory that we present as Group Importance Sampling (GIS) has been employed implicitly in different works in the literature. The provided analysis yields several theoretical and practical consequences. For instance, we discuss the application of GIS into the Sequential Importance Resampling framework and show that Independent Multiple Try Metropolis schemes can be interpreted as a standard Metropolis-Hastings algorithm, following the GIS approach. We also introduce two novel Markov Chain Monte Carlo (MCMC) techniques based on GIS. The first one, named Group Metropolis Sampling method, produces a Markov chain of sets of weighted samples. All these sets are then employed for obtaining a unique global estimator. The second one is the Distributed Particle Metropolis-Hastings technique, where different parallel particle filters are jointly used to drive an MCMC algorithm. Different resampled trajectories are compared and then tested with a proper acceptance probability. The novel schemes are tested in different numerical experiments such as learning the hyperparameters of Gaussian Processes, two localization problems in a wireless sensor network (with synthetic and real data) and the tracking of vegetation parameters given satellite observations, where they are compared with several benchmark Monte Carlo techniques. Three illustrative Matlab demos are also provided.

COJul 30, 2015
Orthogonal parallel MCMC methods for sampling and optimization

L. Martino, V. Elvira, D. Luengo et al.

Monte Carlo (MC) methods are widely used for Bayesian inference and optimization in statistics, signal processing and machine learning. A well-known class of MC methods are Markov Chain Monte Carlo (MCMC) algorithms. In order to foster better exploration of the state space, specially in high-dimensional applications, several schemes employing multiple parallel MCMC chains have been recently introduced. In this work, we describe a novel parallel interacting MCMC scheme, called {\it orthogonal MCMC} (O-MCMC), where a set of "vertical" parallel MCMC chains share information using some "horizontal" MCMC techniques working on the entire population of current states. More specifically, the vertical chains are led by random-walk proposals, whereas the horizontal MCMC techniques employ independent proposals, thus allowing an efficient combination of global exploration and local approximation. The interaction is contained in these horizontal iterations. Within the analysis of different implementations of O-MCMC, novel schemes in order to reduce the overall computational cost of parallel multiple try Metropolis (MTM) chains are also presented. Furthermore, a modified version of O-MCMC for optimization is provided by considering parallel simulated annealing (SA) algorithms. Numerical results show the advantages of the proposed sampling scheme in terms of efficiency in the estimation, as well as robustness in terms of independence with respect to initial values and the choice of the parameters.

COMay 18, 2015
Layered Adaptive Importance Sampling

L. Martino, V. Elvira, D. Luengo et al.

Monte Carlo methods represent the "de facto" standard for approximating complicated integrals involving multidimensional target distributions. In order to generate random realizations from the target distribution, Monte Carlo techniques use simpler proposal probability densities to draw candidate samples. The performance of any such method is strictly related to the specification of the proposal distribution, such that unfortunate choices easily wreak havoc on the resulting estimators. In this work, we introduce a layered (i.e., hierarchical) procedure to generate samples employed within a Monte Carlo scheme. This approach ensures that an appropriate equivalent proposal density is always obtained automatically (thus eliminating the risk of a catastrophic performance), although at the expense of a moderate increase in the complexity. Furthermore, we provide a general unified importance sampling (IS) framework, where multiple proposal densities are employed and several IS schemes are introduced by applying the so-called deterministic mixture approach. Finally, given these schemes, we also propose a novel class of adaptive importance samplers using a population of proposals, where the adaptation is driven by independent parallel or interacting Markov Chain Monte Carlo (MCMC) chains. The resulting algorithms efficiently combine the benefits of both IS and MCMC methods.

MLJan 20, 2015
Scalable Multi-Output Label Prediction: From Classifier Chains to Classifier Trellises

J. Read, L. Martino, P. Olmos et al.

Multi-output inference tasks, such as multi-label classification, have become increasingly important in recent years. A popular method for multi-label classification is classifier chains, in which the predictions of individual classifiers are cascaded along a chain, thus taking into account inter-label dependencies and improving the overall performance. Several varieties of classifier chain methods have been introduced, and many of them perform very competitively across a wide range of benchmark datasets. However, scalability limitations become apparent on larger datasets when modeling a fully-cascaded chain. In particular, the methods' strategies for discovering and modeling a good chain structure constitutes a mayor computational bottleneck. In this paper, we present the classifier trellis (CT) method for scalable multi-label classification. We compare CT with several recently proposed classifier chain methods to show that it occupies an important niche: it is highly competitive on standard multi-label problems, yet it can also scale up to thousands or even tens of thousands of labels.

COAug 17, 2013
Adaptive Independent Sticky MCMC algorithms

L. Martino, R. Casarin, F. Leisen et al.

In this work, we introduce a novel class of adaptive Monte Carlo methods, called adaptive independent sticky MCMC algorithms, for efficient sampling from a generic target probability density function (pdf). The new class of algorithms employs adaptive non-parametric proposal densities which become closer and closer to the target as the number of iterations increases. The proposal pdf is built using interpolation procedures based on a set of support points which is constructed iteratively based on previously drawn samples. The algorithm's efficiency is ensured by a test that controls the evolution of the set of support points. This extra stage controls the computational cost and the convergence of the proposal density to the target. Each part of the novel family of algorithms is discussed and several examples are provided. Although the novel algorithms are presented for univariate target densities, we show that they can be easily extended to the multivariate context within a Gibbs-type sampler. The ergodicity is ensured and discussed. Exhaustive numerical examples illustrate the efficiency of sticky schemes, both as a stand-alone methods to sample from complicated one-dimensional pdfs and within Gibbs in order to draw from multi-dimensional target distributions.