Generalized Naive Bayes
This work addresses a domain-specific problem in machine learning for improving probabilistic modeling and feature selection, representing an incremental advancement.
The authors tackled the problem of extending Naive Bayes to a more flexible structure called Generalized Naive Bayes (GNB), introducing greedy and optimal algorithms that fit data at least as well as classical Naive Bayes and outperform related algorithms in many cases.
In this paper we introduce the so-called Generalized Naive Bayes structure as an extension of the Naive Bayes structure. We give a new greedy algorithm that finds a good fitting Generalized Naive Bayes (GNB) probability distribution. We prove that this fits the data at least as well as the probability distribution determined by the classical Naive Bayes (NB). Then, under a not very restrictive condition, we give a second algorithm for which we can prove that it finds the optimal GNB probability distribution, i.e. best fitting structure in the sense of KL divergence. Both algorithms are constructed to maximize the information content and aim to minimize redundancy. Based on these algorithms, new methods for feature selection are introduced. We discuss the similarities and differences to other related algorithms in terms of structure, methodology, and complexity. Experimental results show, that the algorithms introduced outperform the related algorithms in many cases.