Polina Suter

10.3COMay 2, 2021

Bayesian structure learning and sampling of Bayesian networks with the R package BiDAG

Polina Suter, Jack Kuipers, Giusi Moffa et al.

The R package BiDAG implements Markov chain Monte Carlo (MCMC) methods for structure learning and sampling of Bayesian networks. The package includes tools to search for a maximum a posteriori (MAP) graph and to sample graphs from the posterior distribution given the data. A new hybrid approach to structure learning enables inference in large graphs. In the first step, we define a reduced search space by means of the PC algorithm or based on prior knowledge. In the second step, an iterative order MCMC scheme proceeds to optimize within the restricted search space and estimate the MAP graph. Sampling from the posterior distribution is implemented using either order or partition MCMC. The models and algorithms can handle both discrete and continuous data. The BiDAG package also provides an implementation of MCMC schemes for structure learning and sampling of dynamic Bayesian networks.

13.9MLMar 21, 2018Code

Efficient Sampling and Structure Learning of Bayesian Networks

Jack Kuipers, Polina Suter, Giusi Moffa

Bayesian networks are probabilistic graphical models widely employed to understand dependencies in high dimensional data, and even to facilitate causal discovery. Learning the underlying network structure, which is encoded as a directed acyclic graph (DAG) is highly challenging mainly due to the vast number of possible networks in combination with the acyclicity constraint. Efforts have focussed on two fronts: constraint-based methods that perform conditional independence tests to exclude edges and score and search approaches which explore the DAG space with greedy or MCMC schemes. Here we synthesise these two fields in a novel hybrid method which reduces the complexity of MCMC approaches to that of a constraint-based method. Individual steps in the MCMC scheme only require simple table lookups so that very long chains can be efficiently obtained. Furthermore, the scheme includes an iterative procedure to correct for errors from the conditional independence tests. The algorithm offers markedly superior performance to alternatives, particularly because DAGs can also be sampled from the posterior distribution, enabling full Bayesian model averaging for much larger Bayesian networks.

Polina Suter

2 Papers