LG AI ML

Jan 10, 2013

Multivariate Information Bottleneck

Nir Friedman, Ori Mosenzon, Noam Slonim, Naftali Tishby

arXiv:1301.2270v1

226 citations

Originality: Incremental advance

AI Analysis

This work addresses the need to organize data into several interrelated partitions in fields such as document classification and gene expression analysis. It appears incremental because it builds on the existing information bottleneck method.

The paper tackles the problem of extending the information bottleneck method to multivariate settings, introducing a principled framework that uses Bayesian networks to specify interrelated data partitions and provides algorithms for constructing solutions.

The information bottleneck method is an unsupervised non-parametric data organization technique. Given a joint distribution P(A,B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are informative about B. The information bottleneck has already been applied to document classification, gene expression, neural code, and spectral analysis. In this paper, we introduce a general principled framework for multivariate extensions of the information bottleneck method. This allows us to consider multiple systems of data partitions that are inter-related. Our approach utilizes Bayesian networks for specifying the systems of clusters and what information each captures. We show that this construction provides insight about bottleneck variations and enables us to characterize solutions of these variations. We also present a general framework for iterative algorithms for constructing solutions, and apply it to several examples.
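To make the construction of T from P(A,B) concrete, here is a minimal sketch of the original two-variable information bottleneck iteration, not the multivariate framework this paper introduces. It assumes the standard self-consistent updates, in which p(t|a) is proportional to p(t) exp(-beta * KL(p(b|a) || p(b|t))). The function names, the trade-off parameter `beta`, the iteration count, and the toy distribution are illustrative assumptions that do not appear in the abstract.

```python
import math
import random


def mutual_information(joint):
    """I(X;Y) in nats for a joint distribution given as a list of rows."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )


def information_bottleneck(p_ab, n_clusters, beta, n_iter=200, seed=0):
    """Soft-cluster the values of A into T; returns p(t|a) as a list of rows.

    Assumes every value of A has nonzero probability.
    """
    rng = random.Random(seed)
    n_a, n_b = len(p_ab), len(p_ab[0])
    p_a = [sum(row) for row in p_ab]
    p_b_given_a = [[p / p_a[a] for p in p_ab[a]] for a in range(n_a)]
    tiny = 1e-300  # guards against log(0) and division by zero

    # Random soft initialisation of p(t|a).
    p_t_given_a = []
    for _ in range(n_a):
        w = [rng.random() + 1e-3 for _ in range(n_clusters)]
        s = sum(w)
        p_t_given_a.append([x / s for x in w])

    for _ in range(n_iter):
        # p(t) = sum_a p(a) p(t|a)
        p_t = [
            sum(p_a[a] * p_t_given_a[a][t] for a in range(n_a))
            for t in range(n_clusters)
        ]
        # p(b|t) = sum_a p(a) p(t|a) p(b|a) / p(t)
        p_b_given_t = [
            [
                sum(p_a[a] * p_t_given_a[a][t] * p_b_given_a[a][b]
                    for a in range(n_a)) / max(p_t[t], tiny)
                for b in range(n_b)
            ]
            for t in range(n_clusters)
        ]
        # p(t|a) proportional to p(t) * exp(-beta * KL(p(b|a) || p(b|t)))
        for a in range(n_a):
            logits = []
            for t in range(n_clusters):
                kl = sum(
                    q * math.log(q / max(p_b_given_t[t][b], tiny))
                    for b, q in enumerate(p_b_given_a[a])
                    if q > 0
                )
                logits.append(math.log(max(p_t[t], tiny)) - beta * kl)
            m = max(logits)  # subtract the max for numerical stability
            w = [math.exp(x - m) for x in logits]
            s = sum(w)
            p_t_given_a[a] = [x / s for x in w]
    return p_t_given_a


def joint_tb(p_ab, p_t_given_a):
    """p(t,b) = sum_a p(a,b) p(t|a), using the Markov chain T - A - B."""
    n_a, n_b, n_t = len(p_ab), len(p_ab[0]), len(p_t_given_a[0])
    return [
        [sum(p_ab[a][b] * p_t_given_a[a][t] for a in range(n_a))
         for b in range(n_b)]
        for t in range(n_t)
    ]


# Toy joint distribution (hypothetical): four values of A, two values of B.
p_ab = [[0.20, 0.05], [0.20, 0.05], [0.05, 0.20], [0.05, 0.20]]
p_t_given_a = information_bottleneck(p_ab, n_clusters=2, beta=10.0)
print("I(A;B) =", mutual_information(p_ab))
print("I(T;B) =", mutual_information(joint_tb(p_ab, p_t_given_a)))
```

Because T depends on B only through A, I(T;B) can never exceed I(A;B). Larger values of `beta` favor keeping information about B over compressing A, and values of A with identical p(b|a) always receive identical cluster assignments under this update.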
