A Multivariate Discretization Method for Learning Bayesian Networks from Mixed Data
This work addresses a specific bottleneck in Bayesian network learning for data with mixed variable types, presenting an incremental improvement over existing discretization techniques.
The paper tackles the discretization of continuous variables when learning Bayesian networks from mixed data. It introduces a multivariate method that uses a Bayesian scoring metric to choose each variable's discretization according to its interactions with other variables, and that re-adjusts the discretization as the network structure changes.
In this paper we address the problem of discretization in the context of learning Bayesian networks (BNs) from data containing both continuous and discrete variables. We describe a new technique for multivariate discretization, whereby each continuous variable is discretized while taking into account its interaction with the other variables. The technique is based on the use of a Bayesian scoring metric that scores the discretization policy for a continuous variable given a BN structure and the observed data. Since the metric is relative to the BN structure currently being evaluated, the discretization of a variable needs to be dynamically adjusted as the BN structure changes.
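To make the idea of scoring a discretization policy relative to a structure concrete, the following is a minimal sketch and not the metric defined in the paper. It assumes a tiny structure in which a continuous variable X is the only parent of a discrete variable Y, and it substitutes the standard K2 (uniform Dirichlet) marginal likelihood for the paper's scoring metric. All function names (`discretize`, `k2_log_score`, `score_policy`, `best_policy`) and the toy data are hypothetical.

```python
from bisect import bisect_right
from collections import Counter
from math import lgamma


def discretize(values, cuts):
    """Map each continuous value to the index of its interval (cuts must be sorted)."""
    return [bisect_right(cuts, v) for v in values]


def k2_log_score(child, parent_configs, r):
    """Log K2 marginal likelihood of a discrete child given its parent configurations.

    Assumption: this is a stand-in for the paper's metric. r is the number of
    child states, encoded as 0..r-1.
    """
    n_jk = Counter(zip(parent_configs, child))  # counts per (parent config, child state)
    n_j = Counter(parent_configs)               # counts per parent config
    score = 0.0
    for j, n in n_j.items():
        score += lgamma(r) - lgamma(n + r)
        for k in range(r):
            score += lgamma(n_jk[(j, k)] + 1)
    return score


def score_policy(x, y, cuts, r):
    """Score one candidate policy (a list of cut points) for X under the structure X -> Y."""
    return k2_log_score(y, discretize(x, cuts), r)


def best_policy(x, y, candidates, r):
    """Return the highest-scoring candidate policy for the assumed structure."""
    return max(candidates, key=lambda cuts: score_policy(x, y, cuts, r))


# Toy data: Y switches from 0 to 1 as X crosses 0.5.
x = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
y = [0, 0, 0, 0, 1, 1, 1, 1]
candidates = [[], [0.5], [0.25, 0.5, 0.75]]
print(best_policy(x, y, candidates, r=2))
```

In this sketch the score depends on which variables are adjacent to X in the structure. If a search step added or removed an arc involving X, the same candidates would have to be rescored against the new set of neighbours, which is the sense in which a discretization needs to be adjusted as the structure changes.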