Exact Structure Discovery in Bayesian Networks with Less Space
This work addresses the scalability issue for researchers and practitioners in machine learning and AI who need to discover optimal Bayesian network structures for larger networks, offering incremental improvements in space efficiency.
The paper tackles the problem of exact structure discovery in Bayesian networks, which is limited by high space requirements, by introducing space-time tradeoffs that reduce space usage while maintaining or improving time complexity, achieving time 2^(2n-s)n^O(1) in space 2^s n^O(1) for bounded indegree and time 2^(n(3/2)^p)n^O(1) in space 2^(n(3/4)^p)n^O(1) for indegrees up to 0.238n.
The fastest known exact algorithms for scorebased structure discovery in Bayesian networks on n nodes run in time and space 2nnO(1). The usage of these algorithms is limited to networks on at most around 25 nodes mainly due to the space requirement. Here, we study space-time tradeoffs for finding an optimal network structure. When little space is available, we apply the Gurevich-Shelah recurrence-originally proposed for the Hamiltonian path problem-and obtain time 22n-snO(1) in space 2snO(1) for any s = n/2, n/4, n/8, . . .; we assume the indegree of each node is bounded by a constant. For the more practical setting with moderate amounts of space, we present a novel scheme. It yields running time 2n(3/2)pnO(1) in space 2n(3/4)pnO(1) for any p = 0, 1, . . ., n/2; these bounds hold as long as the indegrees are at most 0.238n. Furthermore, the latter scheme allows easy and efficient parallelization beyond previous algorithms. We also explore empirically the potential of the presented techniques.