AI

Aug 7, 2014

Robust Feature Selection by Mutual Information Distributions

arXiv:1408.1487v1

127 citations

Originality: Incremental advance

AI Analysis

This work addresses the reliability of feature selection for incremental learning and classification, particularly in naive Bayes classifiers, offering an incremental improvement over existing methods.

The paper tackles robust feature selection by deriving the distribution of mutual information in a Bayesian framework. It gives an exact expression for the mean and an analytical approximation of the variance, then uses these results to define a fast method that outperforms selection based on empirical mutual information on a number of real data sets.

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables. In order to address questions such as the reliability of the empirical value, one must consider sample-to-population inferential approaches. This paper deals with the distribution of mutual information, as obtained in a Bayesian framework by a second-order Dirichlet prior distribution. The exact analytical expression for the mean and an analytical approximation of the variance are reported. Asymptotic approximations of the distribution are proposed. The results are applied to the problem of selecting features for incremental learning and classification of the naive Bayes classifier. A fast, newly defined method is shown to outperform the traditional approach based on empirical mutual information on a number of real data sets. Finally, a theoretical development is reported that allows one to efficiently extend the above methods to incomplete samples in an easy and effective way.
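To make the contrast between the empirical value and the Bayesian mean concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's code: the abstract does not state the formula, so the closed form used in `posterior_mean_mi` is the digamma expression commonly given for a Dirichlet posterior over the joint cell probabilities, and the function names, the symmetric `prior` pseudo-count, and the hand-rolled `digamma` helper are all hypothetical choices made here.

```python
import math


def digamma(x):
    """Digamma function for x > 0: recurrence up to x >= 6, then asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def _totals(table):
    """Return (grand total, row totals, column totals) of a 2-D table."""
    rows = [sum(row) for row in table]
    cols = [sum(col) for col in zip(*table)]
    return sum(rows), rows, cols


def empirical_mi(counts):
    """Plug-in mutual information (in nats) from a table of counts."""
    n, rows, cols = _totals(counts)
    mi = 0.0
    for i, row in enumerate(counts):
        for j, nij in enumerate(row):
            if nij > 0:
                mi += (nij / n) * math.log(nij * n / (rows[i] * cols[j]))
    return mi


def posterior_mean_mi(counts, prior=1.0):
    """Mean of mutual information under a Dirichlet posterior.

    Assumption: `prior` is a symmetric pseudo-count added to every cell,
    and the mean has the digamma closed form noted in the lead-in.
    """
    post = [[c + prior for c in row] for row in counts]
    n, rows, cols = _totals(post)
    total = 0.0
    for i, row in enumerate(post):
        for j, nij in enumerate(row):
            total += nij * (
                digamma(nij + 1) - digamma(rows[i] + 1)
                - digamma(cols[j] + 1) + digamma(n + 1)
            )
    return total / n


if __name__ == "__main__":
    table = [[10, 10], [10, 10]]  # made-up counts with no empirical dependence
    print(empirical_mi(table), posterior_mean_mi(table))
```

On a table like the one in the usage example, the empirical value is zero while the posterior mean is small but positive, which illustrates why a single plug-in number says little about reliability at small sample sizes.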

