LG | AI | Dec 4, 2023

Divide-and-Conquer Strategy for Large-Scale Dynamic Bayesian Network Structure Learning

arXiv:2312.01739

Originality: Incremental advance

AI Analysis

This work addresses a scalability bottleneck in structure learning for Dynamic Bayesian Networks, which matters for applications such as gene expression analysis and healthcare. The contribution is incremental: it adapts an existing strategy to a specific class of networks.

The paper tackles the challenge of learning large-scale Dynamic Bayesian Network structures, particularly for datasets with thousands of variables, by adapting a divide-and-conquer strategy originally developed for static Bayesian Networks. On instances with more than 1,000 variables, the method improves two accuracy metrics by 74.45% and 110.94% on average, respectively, and reduces runtime by 93.65% on average.
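To put the reported percentages in more familiar terms, the short sketch below converts them into multiplicative factors. It assumes the figures are relative to the baseline algorithm (runtime reduction as a fraction of baseline runtime, accuracy improvement as a fraction of the baseline score); the summary does not spell this out, and the function names are purely illustrative.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Speedup factor implied by a relative runtime reduction.

    Assumption: the new runtime equals (1 - reduction) times the baseline runtime.
    """
    return 1.0 / (1.0 - reduction_pct / 100.0)


def factor_from_improvement(improvement_pct: float) -> float:
    """Multiplicative factor implied by a relative improvement.

    Assumption: the new score equals (1 + improvement) times the baseline score.
    """
    return 1.0 + improvement_pct / 100.0


if __name__ == "__main__":
    print(f"runtime speedup:   {speedup_from_reduction(93.65):.2f}x")
    print(f"accuracy metric 1: {factor_from_improvement(74.45):.4f}x baseline")
    print(f"accuracy metric 2: {factor_from_improvement(110.94):.4f}x baseline")
```

Under these assumptions, a 93.65% runtime reduction corresponds to roughly a 15.7-fold speedup, and a 110.94% improvement means the second metric slightly more than doubles.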

Dynamic Bayesian Networks (DBNs), renowned for their interpretability, have become increasingly vital in representing complex stochastic processes in various domains such as gene expression analysis, healthcare, and traffic prediction. Structure learning of DBNs from data is challenging, particularly for datasets with thousands of variables. Most current algorithms for DBN structure learning are adaptations of those used in static Bayesian Networks (BNs), and are typically focused on small-scale problems. In order to solve large-scale problems while taking full advantage of existing algorithms, this paper introduces a novel divide-and-conquer strategy, originally developed for static BNs, and adapts it for large-scale DBN structure learning. In this work, we specifically concentrate on 2 Time-sliced Bayesian Networks (2-TBNs), a special class of DBNs. Furthermore, we leverage the prior knowledge of 2-TBNs to enhance the performance of the strategy we introduce. Our approach significantly improves the scalability and accuracy of 2-TBN structure learning. Experimental results demonstrate the effectiveness of our method, showing substantial improvements over existing algorithms in both computational efficiency and structure learning accuracy. On problem instances with more than 1,000 variables, our approach improves two accuracy metrics by 74.45% and 110.94% on average, respectively, while reducing runtime by 93.65% on average.
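The abstract does not say which prior knowledge about 2-TBNs the method exploits. One standard structural property of a 2-TBN is temporal ordering: the transition network only assigns parents to variables in the current slice, so no edge points into the previous slice. The sketch below is not the paper's algorithm. It only illustrates how that constraint shrinks the candidate edge set and how a time series is unrolled into two-slice samples. All function names and the `(name, slice)` node encoding are hypothetical.

```python
from itertools import product


def two_slice_samples(series):
    """Unroll a time series into (previous, current) observation pairs."""
    return [(series[t - 1], series[t]) for t in range(1, len(series))]


def allowed_edges(variables):
    """Candidate directed edges of a 2-TBN transition network.

    Nodes are (name, slice) with slice 0 = time t-1 and slice 1 = time t.
    Only edges INTO slice 1 are kept: inter-slice edges (0 -> 1) and
    intra-slice edges (1 -> 1). Edges into slice 0 are excluded.
    """
    nodes = [(v, s) for v in variables for s in (0, 1)]
    edges = []
    for src, dst in product(nodes, repeat=2):
        if src == dst:
            continue  # no self-loops
        if dst[1] == 0:
            continue  # previous-slice nodes receive no parents
        edges.append((src, dst))
    return edges


if __name__ == "__main__":
    variables = ["A", "B", "C"]
    n = len(variables)
    constrained = allowed_edges(variables)
    unconstrained = 2 * n * (2 * n - 1)  # all ordered pairs over 2n nodes
    print(len(constrained), "of", unconstrained, "candidate edges remain")
    print(two_slice_samples([10, 11, 12]))
```

For n variables the constraint leaves n(2n - 1) candidate edges out of 2n(2n - 1), halving the search space before any divide-and-conquer partitioning is applied.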

