IT, LG, ML
May 6, 2017

Nonlinear Information Bottleneck

arXiv:1705.02436v9
182 citations
Originality: Incremental advance
AI Analysis

The paper matters to machine learning researchers and practitioners who need IB to work beyond the small discrete and Gaussian cases. The advance is incremental because it builds on prior IB methods.

The paper tackles the problem of applying the Information Bottleneck (IB) technique to arbitrarily distributed discrete and/or continuous variables with nonlinear encoding and decoding maps. It proposes a method based on a novel non-parametric upper bound on mutual information, implemented with neural networks, and reports better performance than variational IB on several real-world datasets.

Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete $X$ and $Y$ with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.
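To make the compression term concrete: IB is commonly written as maximizing $I(Y;M) - \beta I(X;M)$, so a trainable method needs a tractable handle on $I(X;M)$. The abstract does not give the form of its non-parametric bound, so the sketch below is an assumption for illustration only. It takes a stochastic encoder $M = f(X) + \epsilon$ with isotropic Gaussian noise and uses a pairwise-distance bound of the kind known for Gaussian mixtures, evaluated under the empirical distribution of the samples. The function name `pairwise_mi_upper_bound` and the parameters `means` and `noise_std` are hypothetical, and the paper's exact estimator may differ.

```python
import math


def pairwise_mi_upper_bound(means, noise_std):
    """Pairwise-distance upper bound (in nats) on I(X; M).

    Assumes M = f(X) + Gaussian noise with covariance noise_std**2 * I,
    and that X is uniform over the given samples (empirical distribution).
    `means` is a list of encoder outputs f(x_i), each a list of floats.
    This is an illustrative sketch, not the paper's reference code.
    """
    n = len(means)
    total = 0.0
    for mu_i in means:
        # KL between two Gaussians with equal covariance sigma^2 * I is
        # ||mu_i - mu_j||^2 / (2 * sigma^2); store its negative.
        neg_kls = [
            -sum((a - b) ** 2 for a, b in zip(mu_i, mu_j)) / (2.0 * noise_std ** 2)
            for mu_j in means
        ]
        # The j == i term contributes exp(0) = 1, so the sum is at least 1
        # and the logarithm below is always defined.
        log_avg = math.log(sum(math.exp(v) for v in neg_kls) / n)
        total -= log_avg
    return total / n
```

Under these assumptions the bound behaves as compression intuition suggests: it is zero when every input maps to the same encoder output, and it approaches $\log N$ when the $N$ outputs are far apart relative to the noise scale. In a neural-network implementation, `means` would be the encoder activations for a minibatch, and the bound would be computed with a differentiable array library rather than Python loops.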

Code Implementations: 3 repos
