LG | ML | May 16, 2019

On Variational Bounds of Mutual Information

arXiv:1905.06922v1 | 1007 citations
Originality: Incremental advance
AI Analysis

This work addresses a core problem in machine learning for researchers and practitioners dealing with high-dimensional data, but it is incremental as it builds on and refines existing variational bounds.

The paper tackles the challenge of estimating and optimizing Mutual Information (MI) in high dimensions by unifying existing variational bounds into a single framework, finding that they degrade with large MI, and introducing a continuum of lower bounds that trade off bias and variance, demonstrating effectiveness on controlled problems.

Estimating and optimizing Mutual Information (MI) is core to many problems in machine learning; however, bounding MI in high dimensions is challenging. To establish tractable and scalable objectives, recent work has turned to variational bounds parameterized by neural networks, but the relationships and tradeoffs between these bounds remain unclear. In this work, we unify these recent developments in a single framework. We find that the existing variational lower bounds degrade when the MI is large, exhibiting either high bias or high variance. To address this problem, we introduce a continuum of lower bounds that encompasses previous bounds and flexibly trades off bias and variance. On high-dimensional, controlled problems, we empirically characterize the bias and variance of the bounds and their gradients and demonstrate the effectiveness of our new bounds for estimation and representation learning.
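
The "high bias" failure mode mentioned above can be illustrated with a small sketch. It computes the InfoNCE lower bound, one of the existing variational bounds in this family, from a K x K matrix of critic scores; with K samples per batch that bound can never exceed log K, however large the true MI is. The toy setup is an assumption made for illustration, not the authors' code: correlated Gaussians with a hand-picked correlation `rho`, and a closed-form critic standing in for a trained neural network. The function names `infonce_lower_bound` and `gaussian_scores` are hypothetical.

```python
import numpy as np


def infonce_lower_bound(scores):
    """InfoNCE estimate from scores[i, j] = f(x_i, y_j).

    Diagonal entries score jointly drawn pairs; off-diagonal entries
    score mismatched pairs from the same batch.
    """
    k = scores.shape[0]
    # Numerically stable log-mean-exp over each row.
    row_max = scores.max(axis=1, keepdims=True)
    log_sum_exp = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    log_mean_exp = log_sum_exp - np.log(k)
    return float(np.mean(np.diag(scores) - log_mean_exp))


def gaussian_scores(x, y, rho):
    """Assumed critic: log p(y_j | x_i) up to a constant, for y = rho*x + noise."""
    diff = y[None, :, :] - rho * x[:, None, :]  # diff[i, j] = y_j - rho * x_i
    return -np.sum(diff ** 2, axis=-1) / (2.0 * (1.0 - rho ** 2))


# Toy problem (assumed values): d-dimensional Gaussians, correlation rho per dimension.
rng = np.random.default_rng(0)
d, k, rho = 20, 128, 0.9
x = rng.standard_normal((k, d))
y = rho * x + np.sqrt(1.0 - rho ** 2) * rng.standard_normal((k, d))

true_mi = -0.5 * d * np.log(1.0 - rho ** 2)  # closed-form MI in nats
estimate = infonce_lower_bound(gaussian_scores(x, y, rho))

print(f"true MI       : {true_mi:.2f} nats")
print(f"InfoNCE bound : {estimate:.2f} nats")
print(f"log K ceiling : {np.log(k):.2f} nats")
```

In this configuration the true MI is well above log K, so the estimate saturates near the ceiling even with a good critic. Raising the batch size K lifts the ceiling only logarithmically.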

Code Implementations: 3 repos
