MLJun 20, 2023
A Bayesian Take on Gaussian Process NetworksEnrico Giudice, Jack Kuipers, Giusi Moffa
Gaussian Process Networks (GPNs) are a class of directed graphical models which employ Gaussian processes as priors for the conditional expectation of each variable given its parents in the network. The model allows the description of continuous joint distributions in a compact but flexible manner with minimal parametric assumptions on the dependencies between variables. Bayesian structure learning of GPNs requires computing the posterior over graphs of the network and is computationally infeasible even in low dimensions. This work implements Monte Carlo and Markov Chain Monte Carlo methods to sample from the posterior distribution of network structures. As such, the approach follows the Bayesian paradigm, comparing models via their marginal likelihood and computing the posterior probability of the GPN features. Simulation studies show that our method outperforms state-of-the-art algorithms in recovering the graphical structure of the network and provides an accurate approximation of its posterior distribution.
MLFeb 1, 2024
Bayesian Causal Inference with Gaussian Process NetworksEnrico Giudice, Jack Kuipers, Giusi Moffa
Causal discovery and inference from observational data is an essential problem in statistics posing both modeling and computational challenges. These are typically addressed by imposing strict assumptions on the joint distribution such as linearity. We consider the problem of the Bayesian estimation of the effects of hypothetical interventions in the Gaussian Process Network (GPN) model, a flexible causal framework which allows describing the causal relationships nonparametrically. We detail how to perform causal inference on GPNs by simulating the effect of an intervention across the whole network and propagating the effect of the intervention on downstream variables. We further derive a simpler computational approximation by estimating the intervention distribution as a function of local variables only, modeling the conditional distributions via additive Gaussian processes. We extend both frameworks beyond the case of a known causal graph, incorporating uncertainty about the causal structure via Markov chain Monte Carlo methods. Simulation studies show that our approach is able to identify the effects of hypothetical interventions with non-Gaussian, non-linear observational data and accurately reflect the posterior uncertainty of the causal estimates. Finally we compare the results of our GPN-based causal inference approach to existing methods on a dataset of $A.~thaliana$ gene expressions.
MLDec 16, 2021
The Dual PC Algorithm and the Role of Gaussianity for Structure Learning of Bayesian NetworksEnrico Giudice, Jack Kuipers, Giusi Moffa
Learning the graphical structure of Bayesian networks is key to describing data-generating mechanisms in many complex applications but poses considerable computational challenges. Observational data can only identify the equivalence class of the directed acyclic graph underlying a Bayesian network model, and a variety of methods exist to tackle the problem. Under certain assumptions, the popular PC algorithm can consistently recover the correct equivalence class by reverse-engineering the conditional independence (CI) relationships holding in the variable distribution. The dual PC algorithm is a novel scheme to carry out the CI tests within the PC algorithm by leveraging the inverse relationship between covariance and precision matrices. By exploiting block matrix inversions we can also perform tests on partial correlations of complementary (or dual) conditioning sets. The multiple CI tests of the dual PC algorithm proceed by first considering marginal and full-order CI relationships and progressively moving to central-order ones. Simulation studies show that the dual PC algorithm outperforms the classic PC algorithm both in terms of run time and in recovering the underlying network structure, even in the presence of deviations from Gaussianity. Additionally, we show that the dual PC algorithm applies for Gaussian copula models, and demonstrate its performance in that setting.