MLMay 15, 2025
A Scalable Gradient-Based Optimization Framework for Sparse Minimum-Variance Portfolio SelectionSarat Moka, Matias Quiroz, Vali Asimit et al.
Portfolio optimization involves selecting asset weights to minimize a risk-reward objective, such as the portfolio variance in the classical minimum-variance framework. Sparse portfolio selection extends this by imposing a cardinality constraint: only $k$ assets from a universe of $p$ may be included. The standard approach models this problem as a mixed-integer quadratic program and relies on commercial solvers to find the optimal solution. However, the computational costs of such methods increase exponentially with $k$ and $p$, making them too slow for problems of even moderate size. We propose a fast and scalable gradient-based approach that transforms the combinatorial sparse selection problem into a constrained continuous optimization task via Boolean relaxation, while preserving equivalence with the original problem on the set of binary points. Our algorithm employs a tunable parameter that transmutes the auxiliary objective from a convex to a concave function. This allows a stable convex starting point, followed by a controlled path toward a sparse binary solution as the tuning parameter increases and the objective moves toward concavity. In practice, our method matches commercial solvers in asset selection for most instances and, in rare instances, the solution differs by a few assets whilst showing a negligible error in portfolio variance.
MLSep 27, 2018
Variance reduction properties of the reparameterization trickMing Xu, Matias Quiroz, Robert Kohn et al.
The reparameterization trick is widely used in variational inference as it yields more accurate estimates of the gradient of the variational objective than alternative approaches such as the score function method. Although there is overwhelming empirical evidence in the literature showing its success, there is relatively little research exploring why the reparameterization trick is so effective. We explore this under the idealized assumptions that the variational approximation is a mean-field Gaussian density and that the log of the joint density of the model parameters and the data is a quadratic function that depends on the variational mean. From this, we show that the marginal variances of the reparameterization gradient estimator are smaller than those of the score function gradient estimator. We apply the result of our idealized analysis to real-world examples.
MEJul 23, 2018
Subsampling MCMC - An introduction for the survey statisticianMatias Quiroz, Mattias Villani, Robert Kohn et al.
The rapid development of computing power and efficient Markov Chain Monte Carlo (MCMC) simulation algorithms have revolutionized Bayesian statistics, making it a highly practical inference method in applied work. However, MCMC algorithms tend to be computationally demanding, and are particularly slow for large datasets. Data subsampling has recently been suggested as a way to make MCMC methods scalable on massively large data, utilizing efficient sampling schemes and estimators from the survey sampling literature. These developments tend to be unknown by many survey statisticians who traditionally work with non-Bayesian methods, and rarely use MCMC. Our article explains the idea of data subsampling in MCMC by reviewing one strand of work, Subsampling MCMC, a so called pseudo-marginal MCMC approach to speeding up MCMC through data subsampling. The review is written for a survey statistician without previous knowledge of MCMC methods since our aim is to motivate survey sampling experts to contribute to the growing Subsampling MCMC literature.
COMay 8, 2018
Subsampling Sequential Monte Carlo for Static Bayesian ModelsDavid Gunawan, Khue-Dung Dang, Matias Quiroz et al.
We show how to speed up Sequential Monte Carlo (SMC) for Bayesian inference in large data problems by data subsampling. SMC sequentially updates a cloud of particles through a sequence of distributions, beginning with a distribution that is easy to sample from such as the prior and ending with the posterior distribution. Each update of the particle cloud consists of three steps: reweighting, resampling, and moving. In the move step, each particle is moved using a Markov kernel; this is typically the most computationally expensive part, particularly when the dataset is large. It is crucial to have an efficient move step to ensure particle diversity. Our article makes two important contributions. First, in order to speed up the SMC computation, we use an approximately unbiased and efficient annealed likelihood estimator based on data subsampling. The subsampling approach is more memory efficient than the corresponding full data SMC, which is an advantage for parallel computation. Second, we use a Metropolis within Gibbs kernel with two conditional updates. A Hamiltonian Monte Carlo update makes distant moves for the model parameters, and a block pseudo-marginal proposal is used for the particles corresponding to the auxiliary variables for the data subsampling. We demonstrate both the usefulness and limitations of the methodology for estimating four generalized linear models and a generalized additive model with large datasets.
MEJan 24, 2018
Gaussian variational approximation for high-dimensional state space modelsMatias Quiroz, David J. Nott, Robert Kohn
Our article considers a Gaussian variational approximation of the posterior density in a high-dimensional state space model. The variational parameters to be optimized are the mean vector and the covariance matrix of the approximation. The number of parameters in the covariance matrix grows as the square of the number of model parameters, so it is necessary to find simple yet effective parameterizations of the covariance structure when the number of model parameters is large. We approximate the joint posterior distribution over the high-dimensional state vectors by a dynamic factor model, having Markovian time dependence and a factor covariance structure for the states. This gives a reduced description of the dependence structure for the states, as well as a temporal conditional independence structure similar to that in the true posterior. The usefulness of the approach is illustrated for prediction in two high-dimensional applications that are challenging for Markov chain Monte Carlo sampling. The first is a spatio-temporal model for the spread of the Eurasian Collared-Dove across North America; the second is a Wishart-based multivariate stochastic volatility model for financial returns.
COAug 2, 2017
Hamiltonian Monte Carlo with Energy Conserving SubsamplingKhue-Dung Dang, Matias Quiroz, Robert Kohn et al.
Hamiltonian Monte Carlo (HMC) samples efficiently from high-dimensional posterior distributions with proposed parameter draws obtained by iterating on a discretized version of the Hamiltonian dynamics. The iterations make HMC computationally costly, especially in problems with large datasets, since it is necessary to compute posterior densities and their derivatives with respect to the parameters. Naively computing the Hamiltonian dynamics on a subset of the data causes HMC to lose its key ability to generate distant parameter proposals with high acceptance probability. The key insight in our article is that efficient subsampling HMC for the parameters is possible if both the dynamics and the acceptance probability are computed from the same data subsample in each complete HMC iteration. We show that this is possible to do in a principled way in a HMC-within-Gibbs framework where the subsample is updated using a pseudo marginal MH step and the parameters are then updated using an HMC step, based on the current subsample. We show that our subsampling methods are fast and compare favorably to two popular sampling algorithms that utilize gradient estimates from data subsampling. We also explore the current limitations of subsampling HMC algorithms by varying the quality of the variance reducing control variates used in the estimators of the posterior density and its gradients.
COMar 27, 2016
The block-Poisson estimator for optimally tuned exact subsampling MCMCMatias Quiroz, Minh-Ngoc Tran, Mattias Villani et al.
Speeding up Markov Chain Monte Carlo (MCMC) for datasets with many observations by data subsampling has recently received considerable attention. A pseudo-marginal MCMC method is proposed that estimates the likelihood by data subsampling using a block-Poisson estimator. The estimator is a product of Poisson estimators, allowing us to update a single block of subsample indicators in each MCMC iteration so that a desired correlation is achieved between the logs of successive likelihood estimates. This is important since pseudo-marginal MCMC with positively correlated likelihood estimates can use substantially smaller subsamples without adversely affecting the sampling efficiency. The block-Poisson estimator is unbiased but not necessarily positive, so the algorithm runs the MCMC on the absolute value of the likelihood estimator and uses an importance sampling correction to obtain consistent estimates of the posterior mean of any function of the parameters. Our article derives guidelines to select the optimal tuning parameters for our method and shows that it compares very favourably to regular MCMC without subsampling, and to two other recently proposed exact subsampling approaches in the literature.
MEJul 10, 2015
Scalable MCMC for Large Data Problems using Data Subsampling and the Difference EstimatorMatias Quiroz, Mattias Villani, Robert Kohn
We propose a generic Markov Chain Monte Carlo (MCMC) algorithm to speed up computations for datasets with many observations. A key feature of our approach is the use of the highly efficient difference estimator from the survey sampling literature to estimate the log-likelihood accurately using only a small fraction of the data. Our algorithm improves on the $O(n)$ complexity of regular MCMC by operating over local data clusters instead of the full sample when computing the likelihood. The likelihood estimate is used in a Pseudo-marginal framework to sample from a perturbed posterior which is within $O(m^{-1/2})$ of the true posterior, where $m$ is the subsample size. The method is applied to a logistic regression model to predict firm bankruptcy for a large data set. We document a significant speed up in comparison to the standard MCMC on the full dataset.
MEApr 16, 2014
Speeding Up MCMC by Efficient Data SubsamplingMatias Quiroz, Robert Kohn, Mattias Villani et al.
We propose Subsampling MCMC, a Markov Chain Monte Carlo (MCMC) framework where the likelihood function for $n$ observations is estimated from a random subset of $m$ observations. We introduce a highly efficient unbiased estimator of the log-likelihood based on control variates, such that the computing cost is much smaller than that of the full log-likelihood in standard MCMC. The likelihood estimate is bias-corrected and used in two dependent pseudo-marginal algorithms to sample from a perturbed posterior, for which we derive the asymptotic error with respect to $n$ and $m$, respectively. We propose a practical estimator of the error and show that the error is negligible even for a very small $m$ in our applications. We demonstrate that Subsampling MCMC is substantially more efficient than standard MCMC in terms of sampling efficiency for a given computational budget, and that it outperforms other subsampling methods for MCMC proposed in the literature.