Scalable Bayesian Network Structure Learning with Splines
This paper addresses the challenge of learning Bayesian network structures accurately and at scale for data analysis, and represents an incremental advance over prior work.
The paper learns Bayesian network structures while simultaneously modelling linear and non-linear relationships between variables, and reports improved accuracy and scalability on benchmark datasets.
The graph structure of a Bayesian network (BN) can be learned from data using the well-known score-and-search approach. Previous work has shown that incorporating structured representations of the conditional probability distributions (CPDs) into the score-and-search approach can improve the accuracy of the learned graph. In this paper, we present a novel approach capable of learning the graph of a BN while simultaneously modelling linear and non-linear local probabilistic relationships between variables. We achieve this by combining feature selection, which reduces the search space for local relationships, with an extension of the score-and-search approach that models the CPDs over variables as Multivariate Adaptive Regression Splines (MARS). MARS models are polynomial regression models represented as piecewise spline functions. We show on a set of discrete and continuous benchmark instances that our proposed approach can improve the accuracy of the learned graph while scaling to instances with a large number of variables.
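
To make the MARS representation concrete, the following minimal Python sketch evaluates a MARS-style model for the conditional mean of one variable given its parents. It assumes the standard hinge-function form of MARS basis functions, max(0, x - t) and max(0, t - x), with products of hinges capturing interactions. The names `hinge`, `basis_value`, and `mars_predict`, as well as the knots and coefficients, are illustrative assumptions and are not taken from the paper; the sketch does not cover the paper's fitting, feature selection, or search procedure.

```python
from typing import List, Sequence, Tuple

# One hinge term is (parent index, knot, sign):
#   sign = +1 gives max(0, x - knot), sign = -1 gives max(0, knot - x).
HingeTerm = Tuple[int, float, int]


def hinge(x: float, knot: float, sign: int) -> float:
    """Evaluate a single hinge function, the building block of a MARS model."""
    return max(0.0, sign * (x - knot))


def basis_value(terms: Sequence[HingeTerm], parents: Sequence[float]) -> float:
    """A basis function is a product of hinge terms; the empty product is the intercept."""
    value = 1.0
    for parent_index, knot, sign in terms:
        value *= hinge(parents[parent_index], knot, sign)
    return value


def mars_predict(
    coefs: Sequence[float],
    bases: Sequence[Sequence[HingeTerm]],
    parents: Sequence[float],
) -> float:
    """Weighted sum of basis functions: the modelled mean of the child variable."""
    return sum(c * basis_value(b, parents) for c, b in zip(coefs, bases))


# Hypothetical model for a child with two parents (x0, x1):
#   1.0 + 2.0*max(0, x0 - 0.5) - 1.5*max(0, 0.5 - x0)
#       + 0.5*max(0, x0 - 0.5)*max(0, x1 - 1.0)
coefs: List[float] = [1.0, 2.0, -1.5, 0.5]
bases: List[List[HingeTerm]] = [
    [],                            # intercept
    [(0, 0.5, +1)],                # linear piece to the right of the knot
    [(0, 0.5, -1)],                # mirrored piece to the left of the knot
    [(0, 0.5, +1), (1, 1.0, +1)],  # non-linear interaction between x0 and x1
]
```

With these hypothetical values, `mars_predict(coefs, bases, [1.5, 3.0])` evaluates to 4.0 and `mars_predict(coefs, bases, [0.0, 0.0])` evaluates to 0.25. A model with only the intercept and single-hinge terms is piecewise linear in each parent, while product terms introduce the non-linear interactions; this is one way a single model family can cover both linear and non-linear local relationships.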