LG | ML | May 29, 2025

Efficient Parameter Estimation for Bayesian Network Classifiers using Hierarchical Linear Smoothing

arXiv:2505.23320v1 | h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the performance and flexibility issues in Bayesian network classifiers for categorical data, offering an incremental improvement over existing smoothing techniques.

The paper addresses parameter estimation for Bayesian network classifiers, which have historically underperformed methods such as random forests. It introduces a log-linear regression method that approximates the behaviour of hierarchical Dirichlet processes (HDPs). In the reported experiments, the method can outperform HDP smoothing while running orders of magnitude faster, and it remains competitive with random forests on categorical data.

Bayesian network classifiers (BNCs) possess a number of properties desirable for a modern classifier: They are easily interpretable, highly scalable, and offer adaptable complexity. However, traditional methods for learning BNCs have historically underperformed when compared to leading classification methods such as random forests. Recent parameter smoothing techniques using hierarchical Dirichlet processes (HDPs) have enabled BNCs to achieve performance competitive with random forests on categorical data, but these techniques are relatively inflexible, and require a complicated, specialized sampling process. In this paper, we introduce a novel method for parameter estimation that uses a log-linear regression to approximate the behaviour of HDPs. As a linear model, our method is remarkably flexible and simple to interpret, and can leverage the vast literature on learning linear models. Our experiments show that our method can outperform HDP smoothing while being orders of magnitude faster, remaining competitive with random forests on categorical data.
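To make the idea of hierarchical smoothing concrete, here is a minimal Python sketch of generic back-off interpolation for one conditional probability table. It is not the authors' estimator: the abstract does not give the model's details, so the function name `smoothed_conditional`, the three-level hierarchy P(x), P(x | y), P(x | y, p), and the hand-picked `weights` are all illustrative assumptions. In the paper the combination is learned by a log-linear regression rather than fixed by hand.

```python
from collections import Counter


def smoothed_conditional(rows, weights=(0.2, 0.3, 0.5)):
    """Return prob(x, y, p), an interpolated estimate of P(x | y, p).

    rows    -- iterable of (x, y, p) tuples of categorical values, where
               y is the class and p is one extra parent of attribute x.
    weights -- hypothetical fixed mixing weights (must sum to 1) for the
               estimates P(x), P(x | y) and P(x | y, p), in that order.
    """
    rows = list(rows)
    values = sorted({x for x, _, _ in rows})
    total = len(rows)
    uniform = 1.0 / len(values)

    # Counts at each level of the context hierarchy.
    n_x = Counter(x for x, _, _ in rows)
    n_y = Counter(y for _, y, _ in rows)
    n_xy = Counter((x, y) for x, y, _ in rows)
    n_yp = Counter((y, p) for _, y, p in rows)
    n_xyp = Counter(rows)

    def mle(count, context_count, fallback):
        # Maximum-likelihood estimate; back off if the context is unseen.
        return count / context_count if context_count else fallback

    def prob(x, y, p):
        p0 = mle(n_x[x], total, uniform)              # P(x)
        p1 = mle(n_xy[(x, y)], n_y[y], p0)            # P(x | y)
        p2 = mle(n_xyp[(x, y, p)], n_yp[(y, p)], p1)  # P(x | y, p)
        w0, w1, w2 = weights
        return w0 * p0 + w1 * p1 + w2 * p2

    return prob


data = [("a", "pos", "u"), ("a", "pos", "u"), ("b", "pos", "v"), ("b", "neg", "u")]
prob = smoothed_conditional(data)
print(prob("a", "pos", "v"), prob("b", "pos", "v"))
```

In this toy data the raw estimate of P(x = a | pos, v) is zero because that combination never occurs, while the interpolated estimate borrows mass from the coarser contexts. Each level is a proper distribution over x and the weights sum to 1, so the smoothed estimates still sum to 1.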

Code Implementations: 1 repo