ML · LG · Jun 24, 2021

MIxBN: library for learning Bayesian networks from mixed data

arXiv:2106.13194v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses a practical need for handling mixed data in Bayesian network learning, but it is incremental as it builds on existing methods with specific adaptations.

The authors address learning Bayesian networks from mixed discrete and continuous data without discretization, which causes information loss. They present a library whose own algorithm uses a mixed MI score function for structure learning and linear regression with Gaussian approximation for parameter learning. It is evaluated on approximation and gap recovery tasks with synthetic and real datasets; the abstract does not quantify the improvements.

This paper describes a new library for learning Bayesian networks from data containing both discrete and continuous variables (mixed data). In addition to the classical learning methods on discretized data, the library proposes its own algorithm for structure learning and parameter learning from mixed data without discretization, since discretization leads to information loss. This algorithm is based on a mixed mutual information (MI) score function for structure learning, and on linear regression and Gaussian distribution approximation for parameter learning. The library also offers two algorithms for enumerating graph structures: the greedy Hill-Climbing algorithm and the evolutionary algorithm. Thus the key capabilities of the proposed library are as follows: (1) structure and parameter learning of a Bayesian network on discretized data, (2) structure and parameter learning of a Bayesian network on mixed data using the mixed MI score function and Gaussian approximation, (3) running these learning procedures with either of two algorithms for enumerating graph structures, Hill-Climbing or the evolutionary algorithm. Since the need for mixed data representation comes from practical necessity, the advantages of our implementations are evaluated in the context of solving approximation and gap recovery problems on synthetic data and real datasets.
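To make the parameter-learning idea concrete, here is a minimal sketch of fitting a continuous node with linear regression and a Gaussian residual, conditioned on a discrete parent. It is not the MIxBN API: the function names `fit_conditional_gaussian` and `predict_mean` are hypothetical, and the setup (one discrete parent, one continuous parent, a separate regression per discrete value) is an assumption chosen for illustration rather than something the abstract specifies.

```python
from collections import defaultdict
from statistics import fmean


def fit_conditional_gaussian(rows):
    """Fit y ~ N(a + b * x, var) separately for each discrete parent value.

    rows: iterable of (d, x, y), where d is the discrete parent value,
    x the continuous parent value, and y the continuous child value.
    Returns {d: (intercept, slope, residual_variance)}.
    Hypothetical helper for illustration; not part of MIxBN.
    """
    groups = defaultdict(list)
    for d, x, y in rows:
        groups[d].append((x, y))

    params = {}
    for d, pairs in groups.items():
        xs = [x for x, _ in pairs]
        ys = [y for _, y in pairs]
        mean_x, mean_y = fmean(xs), fmean(ys)
        # Ordinary least squares for a single regressor.
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        slope = sxy / sxx if sxx > 0 else 0.0
        intercept = mean_y - slope * mean_x
        # Maximum-likelihood estimate of the Gaussian residual variance.
        residuals = [y - (intercept + slope * x) for x, y in pairs]
        variance = fmean([r * r for r in residuals])
        params[d] = (intercept, slope, variance)
    return params


def predict_mean(params, d, x):
    """Conditional mean of the child given its parents (usable to fill a gap)."""
    intercept, slope, _ = params[d]
    return intercept + slope * x


if __name__ == "__main__":
    data = [
        ("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0),
        ("b", 0.0, 0.0), ("b", 1.0, -1.0), ("b", 2.0, -2.0),
    ]
    fitted = fit_conditional_gaussian(data)
    print(fitted)
    print(predict_mean(fitted, "a", 3.0))
```

Under these assumptions, a missing value of the child can be filled with `predict_mean` given the observed parents, which is one simple reading of the gap recovery task mentioned above. No values are discretized: the continuous parent enters the regression directly, and only the discrete parent partitions the data.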
