LGFeb 20, 2023
Free-Form Variational Inference for Gaussian Process State-Space ModelsXuhui Fan, Edwin V. Bonilla, Terence J. O'Kane et al.
Gaussian process state-space models (GPSSMs) provide a principled and flexible approach to modeling the dynamics of a latent state, which is observed at discrete-time points via a likelihood model. However, inference in GPSSMs is computationally and statistically challenging due to the large number of latent variables in the model and the strong temporal dependencies between them. In this paper, we propose a new method for inference in Bayesian GPSSMs, which overcomes the drawbacks of previous approaches, namely over-simplified assumptions, and high computational requirements. Our method is based on free-form variational inference via stochastic gradient Hamiltonian Monte Carlo within the inducing-variable formalism. Furthermore, by exploiting our proposed variational distribution, we provide a collapsed extension of our method where the inducing variables are marginalized analytically. We also showcase results when combining our framework with particle MCMC methods. We show that, on six real-world datasets, our approach can learn transition dynamics and latent states more accurately than competing methods.
MLSep 27, 2024
Positional Encoder Graph Quantile Neural Networks for Geographic DataWilliam E. R. de Amorim, Scott A. Sisson, T. Rodrigues et al.
Positional Encoder Graph Neural Networks (PE-GNNs) are among the most effective models for learning from continuous spatial data. However, their predictive distributions are often poorly calibrated, limiting their utility in applications that require reliable uncertainty quantification. We propose the Positional Encoder Graph Quantile Neural Network (PE-GQNN), a novel framework that combines PE-GNNs with Quantile Neural Networks, partially monotonic neural blocks, and post-hoc recalibration techniques. The PE-GQNN enables flexible and robust conditional density estimation with minimal assumptions about the target distribution, and it extends naturally to tasks beyond spatial data. Empirical results on benchmark datasets show that the PE-GQNN outperforms existing methods in both predictive accuracy and uncertainty quantification, without incurring additional computational cost. We also provide theoretical insights and identify important special cases arising from our formulation, including the PE-GNN.
CODec 13, 2021
An Introduction to Quantum Computing for Statisticians and Data ScientistsAnna Lopatnikova, Minh-Ngoc Tran, Scott A. Sisson
Quantum computers promise to surpass the most powerful classical supercomputers when it comes to solving many critically important practical problems, such as pharmaceutical and fertilizer design, supply chain and traffic optimization, or optimization for machine learning tasks. Because quantum computers function fundamentally differently from classical computers, the emergence of quantum computing technology will lead to a new evolutionary branch of statistical and data analytics methodologies. This review provides an introduction to quantum computing designed to be accessible to statisticians and data scientists, aiming to equip them with an overarching framework of quantum computing, the basic language and building blocks of quantum algorithms, and an overview of existing quantum applications in statistics and data analysis. Our goal is to enable statisticians and data scientists to follow quantum computing literature relevant to their fields, to collaborate with quantum algorithm designers, and, ultimately, to bring forth the next generation of statistical and data analytics tools.
MLFeb 29, 2020
Online Binary Space Partitioning ForestsXuhui Fan, Bin Li, Scott A. Sisson
The Binary Space Partitioning-Tree~(BSP-Tree) process was recently proposed as an efficient strategy for space partitioning tasks. Because it uses more than one dimension to partition the space, the BSP-Tree Process is more efficient and flexible than conventional axis-aligned cutting strategies. However, due to its batch learning setting, it is not well suited to large-scale classification and regression problems. In this paper, we develop an online BSP-Forest framework to address this limitation. With the arrival of new data, the resulting online algorithm can simultaneously expand the space coverage and refine the partition structure, with guaranteed universal consistency for both classification and regression problems. The effectiveness and competitive performance of the online BSP-Forest is verified via simulations on real-world datasets.
MLFeb 26, 2020
Bayesian Nonparametric Space Partitions: A SurveyXuhui Fan, Bin Li, Ling Luo et al.
Bayesian nonparametric space partition (BNSP) models provide a variety of strategies for partitioning a $D$-dimensional space into a set of blocks. In this way, the data points lie in the same block would share certain kinds of homogeneity. BNSP models can be applied to various areas, such as regression/classification trees, random feature construction, relational modeling, etc. In this survey, we investigate the current progress of BNSP research through the following three perspectives: models, which review various strategies for generating the partitions in the space and discuss their theoretical foundation `self-consistency'; applications, which cover the current mainstream usages of BNSP models and their potential future practises; and challenges, which identify the current unsolved problems and valuable future research topics. As there are no comprehensive reviews of BNSP literature before, we hope that this survey can induce further exploration and exploitation on this topic.
MLFeb 25, 2020
Smoothing Graphons for Modelling Exchangeable Relational DataXuhui Fan, Yaqiong Li, Ling Chen et al.
Modelling exchangeable relational data can be described by \textit{graphon theory}. Most Bayesian methods for modelling exchangeable relational data can be attributed to this framework by exploiting different forms of graphons. However, the graphons adopted by existing Bayesian methods are either piecewise-constant functions, which are insufficiently flexible for accurate modelling of the relational data, or are complicated continuous functions, which incur heavy computational costs for inference. In this work, we introduce a smoothing procedure to piecewise-constant graphons to form {\em smoothing graphons}, which permit continuous intensity values for describing relations, but without impractically increasing computational costs. In particular, we focus on the Bayesian Stochastic Block Model (SBM) and demonstrate how to adapt the piecewise-constant SBM graphon to the smoothed version. We initially propose the Integrated Smoothing Graphon (ISG) which introduces one smoothing parameter to the SBM graphon to generate continuous relational intensity values. We then develop the Latent Feature Smoothing Graphon (LFSG), which improves on the ISG by introducing auxiliary hidden labels to decompose the calculation of the ISG intensity and enable efficient inference. Experimental results on real-world data sets validate the advantages of applying smoothing strategies to the Stochastic Block Model, demonstrating that smoothing graphons can greatly improve AUC and precision for link prediction without increasing computational complexity.
LGFeb 24, 2020
Recurrent Dirichlet Belief Networks for Interpretable Dynamic Relational Data ModellingYaqiong Li, Xuhui Fan, Ling Chen et al.
The Dirichlet Belief Network~(DirBN) has been recently proposed as a promising approach in learning interpretable deep latent representations for objects. In this work, we leverage its interpretable modelling architecture and propose a deep dynamic probabilistic framework -- the Recurrent Dirichlet Belief Network~(Recurrent-DBN) -- to study interpretable hidden structures from dynamic relational data. The proposed Recurrent-DBN has the following merits: (1) it infers interpretable and organised hierarchical latent structures for objects within and across time steps; (2) it enables recurrent long-term temporal dependence modelling, which outperforms the one-order Markov descriptions in most of the dynamic probabilistic frameworks. In addition, we develop a new inference strategy, which first upward-and-backward propagates latent counts and then downward-and-forward samples variables, to enable efficient Gibbs sampling for the Recurrent-DBN. We apply the Recurrent-DBN to dynamic relational data problems. The extensive experiment results on real-world data validate the advantages of the Recurrent-DBN over the state-of-the-art models in interpretable latent structure discovery and improved link prediction performance.
MEDec 9, 2019
Logistic regression models for aggregated dataTom Whitaker, Boris Beranger, Scott A. Sisson
Logistic regression models are a popular and effective method to predict the probability of categorical response data. However inference for these models can become computationally prohibitive for large datasets. Here we adapt ideas from symbolic data analysis to summarise the collection of predictor variables into histogram form, and perform inference on this summary dataset. We develop ideas based on composite likelihoods to derive an efficient one-versus-rest approximate composite likelihood model for histogram-based random variables, constructed from low-dimensional marginal histograms obtained from the full histogram. We demonstrate that this procedure can achieve comparable classification rates compared to the standard full data multinomial analysis and against state-of-the-art subsampling algorithms for logistic regression, but at a substantially lower computational cost. Performance is explored through simulated examples, and analyses of large supersymmetry and satellite crop classification datasets.
COSep 11, 2019
Efficient Bayesian synthetic likelihood with whitening transformationsJacob W. Priddle, Scott A. Sisson, David T. Frazier et al.
Likelihood-free methods are an established approach for performing approximate Bayesian inference for models with intractable likelihood functions. However, they can be computationally demanding. Bayesian synthetic likelihood (BSL) is a popular such method that approximates the likelihood function of the summary statistic with a known, tractable distribution -- typically Gaussian -- and then performs statistical inference using standard likelihood-based techniques. However, as the number of summary statistics grows, the number of model simulations required to accurately estimate the covariance matrix for this likelihood rapidly increases. This poses significant challenge for the application of BSL, especially in cases where model simulation is expensive. In this article we propose whitening BSL (wBSL) -- an efficient BSL method that uses approximate whitening transformations to decorrelate the summary statistics at each algorithm iteration. We show empirically that this can reduce the number of model simulations required to implement BSL by more than an order of magnitude, without much loss of accuracy. We explore a range of whitening procedures and demonstrate the performance of wBSL on a range of simulated and real modelling scenarios from ecology and biology.
COFeb 25, 2019
Vector operations for accelerating expensive Bayesian computations -- a tutorial guideDavid J. Warne, Scott A. Sisson, Christopher Drovandi
Many applications in Bayesian statistics are extremely computationally intensive. However, they are often inherently parallel, making them prime targets for modern massively parallel processors. Multi-core and distributed computing is widely applied in the Bayesian community, however, very little attention has been given to fine-grain parallelisation using single instruction multiple data (SIMD) operations that are available on most modern commodity CPUs and is the basis of GPGPU computing. In this work, we practically demonstrate, using standard programming libraries, the utility of the SIMD approach for several topical Bayesian applications. We show that SIMD can improve the floating point arithmetic performance resulting in up to $6\times$ improvement in serial algorithm performance. Importantly, these improvements are multiplicative to any gains achieved through multi-core processing. We illustrate the potential of SIMD for accelerating Bayesian computations and provide the reader with techniques for exploiting modern massively parallel processing environments using standard tools.
MLSep 27, 2018
Variance reduction properties of the reparameterization trickMing Xu, Matias Quiroz, Robert Kohn et al.
The reparameterization trick is widely used in variational inference as it yields more accurate estimates of the gradient of the variational objective than alternative approaches such as the score function method. Although there is overwhelming empirical evidence in the literature showing its success, there is relatively little research exploring why the reparameterization trick is so effective. We explore this under the idealized assumptions that the variational approximation is a mean-field Gaussian density and that the log of the joint density of the model parameters and the data is a quadratic function that depends on the variational mean. From this, we show that the marginal variances of the reparameterization gradient estimator are smaller than those of the score function gradient estimator. We apply the result of our idealized analysis to real-world examples.
COSep 11, 2018
New models for symbolic data analysisBoris Beranger, Huan Lin, Scott A. Sisson
Symbolic data analysis (SDA) is an emerging area of statistics concerned with understanding and modelling data that takes distributional form (i.e. symbols), such as random lists, intervals and histograms. It was developed under the premise that the statistical unit of interest is the symbol, and that inference is required at this level. Here we consider a different perspective, which opens a new research direction in the field of SDA. We assume that, as with a standard statistical analysis, inference is required at the level of individual-level data. However, the individual-level data are aggregated into symbols - group-based distributional-valued summaries - prior to the analysis. In this way, large and complex datasets can be reduced to a smaller number of distributional summaries, that may be analysed more efficiently than the original dataset. As such, we develop SDA techniques as a new approach for the analysis of big data. In particular we introduce a new general method for constructing likelihood functions for symbolic data based on a desired probability model for the underlying measurement-level data, while only observing the distributional summaries. This approach opens the door for new classes of symbol design and construction, in addition to developing SDA as a viable tool to enable and improve upon classical data analyses, particularly for very large and complex datasets. We illustrate this new direction for SDA research through several real and simulated data analyses.
COAug 2, 2016
Blocking Collapsed Gibbs Sampler for Latent Dirichlet Allocation ModelsXin Zhang, Scott A. Sisson
The latent Dirichlet allocation (LDA) model is a widely-used latent variable model in machine learning for text analysis. Inference for this model typically involves a single-site collapsed Gibbs sampling step for latent variables associated with observations. The efficiency of the sampling is critical to the success of the model in practical large scale applications. In this article, we introduce a blocking scheme to the collapsed Gibbs sampler for the LDA model which can, with a theoretical guarantee, improve chain mixing efficiency. We develop two procedures, an O(K)-step backward simulation and an O(log K)-step nested simulation, to directly sample the latent variables within each block. We demonstrate that the blocking scheme achieves substantial improvements in chain mixing compared to the state of the art single-site collapsed Gibbs sampler. We also show that when the number of topics is over hundreds, the nested-simulation blocking scheme can achieve a significant reduction in computation time compared to the single-site sampler.