Solving Bayesian Network Structure Learning Problem with Integer Linear Programming
This work addresses the computational challenge of learning Bayesian network structures, which is incremental as it builds on existing ILP methods with specific constraints and heuristics.
The dissertation tackles the Bayesian Network structure learning problem by formulating it as an integer linear programming (ILP) problem, incorporating cluster constraints for acyclicity and cutting planes to manage exponential constraints, with results presented from experiments on reference datasets.
This dissertation investigates integer linear programming (ILP) formulation of Bayesian Network structure learning problem. We review the definition and key properties of Bayesian network and explain score metrics used to measure how well certain Bayesian network structure fits the dataset. We outline the integer linear programming formulation based on the decomposability of score metrics. In order to ensure acyclicity of the structure, we add ``cluster constraints'' developed specifically for Bayesian network, in addition to cycle constraints applicable to directed acyclic graphs in general. Since there would be exponential number of these constraints if we specify them fully, we explain the methods to add them as cutting planes without declaring them all in the initial model. Also, we develop a heuristic algorithm that finds a feasible solution based on the idea of sink node on directed acyclic graphs. We implemented the ILP formulation and cutting planes as a \textsf{Python} package, and present the results of experiments with different settings on reference datasets.