LG IT | Jan 19, 2023

DiME: Maximizing Mutual Information by a Difference of Matrix-Based Entropies

Oscar Skean, Jhoan Keider Hoyos Osorio, Austin J. Brockmeier, Luis Gonzalo Sanchez Giraldo

arXiv:2301.08164v3 | 18.019 citations | h-index: 15 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the problem of estimating and maximizing mutual information in machine learning without distributional assumptions. The contribution appears incremental, since it builds on existing matrix-based entropy methods.

The authors introduced DiME, an information-theoretic quantity similar to mutual information that can be estimated without distributional assumptions, and showed it is effective for maximizing mutual information while avoiding trivial solutions, with examples in latent factor disentanglement and multiview representation learning.

We introduce an information-theoretic quantity with similar properties to mutual information that can be estimated from data without making explicit assumptions on the underlying distribution. This quantity is based on a recently proposed matrix-based entropy that uses the eigenvalues of a normalized Gram matrix to compute an estimate of the eigenvalues of an uncentered covariance operator in a reproducing kernel Hilbert space. We show that a difference of matrix-based entropies (DiME) is well suited for problems involving the maximization of mutual information between random variables. While many methods for such tasks can lead to trivial solutions, DiME naturally penalizes such outcomes. We compare DiME to several baseline estimators of mutual information on a toy Gaussian dataset. We provide examples of use cases for DiME, such as latent factor disentanglement and a multiview representation learning problem where DiME is used to learn a shared representation among views with high mutual information.
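The abstract's construction can be illustrated with a minimal NumPy sketch. The entropy below is the matrix-based Rényi entropy of order α, computed from the eigenvalues of a unit-trace Gram matrix. The rest is assumed for illustration and is not taken from the text above: the Gaussian kernel, the bandwidth `sigma`, the order `alpha`, and in particular the specific difference used in the hypothetical `dime_sketch` (joint entropy under randomly shuffled pairings minus joint entropy under the observed pairing). Consult the paper and its code for the authors' exact definition.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gaussian-kernel Gram matrix, normalized to unit trace (kernel choice is an assumption)."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(x.shape[0], -1)
    sq = np.sum(x ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    k = np.exp(-d2 / (2.0 * sigma ** 2))
    return k / np.trace(k)

def matrix_entropy(a, alpha=2.0):
    """Matrix-based Renyi entropy of order alpha (alpha != 1), in bits."""
    eig = np.clip(np.linalg.eigvalsh(a), 0.0, None)  # guard against tiny negative eigenvalues
    return np.log2(np.sum(eig ** alpha)) / (1.0 - alpha)

def joint_entropy(kx, ky, alpha=2.0):
    """Entropy of the trace-normalized Hadamard product of two Gram matrices."""
    h = kx * ky
    return matrix_entropy(h / np.trace(h), alpha)

def dime_sketch(x, y, sigma=1.0, alpha=2.0, n_perm=20, seed=0):
    """Assumed form: mean joint entropy over shuffled pairings minus observed joint entropy."""
    rng = np.random.default_rng(seed)
    kx, ky = gaussian_gram(x, sigma), gaussian_gram(y, sigma)
    n = kx.shape[0]
    shuffled = []
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permuting rows and columns of ky together breaks the sample pairing with x.
        shuffled.append(joint_entropy(kx, ky[np.ix_(p, p)], alpha))
    return float(np.mean(shuffled) - joint_entropy(kx, ky, alpha))
```

Under these assumptions, strongly dependent pairs (for example `y = x`) should give a clearly positive value, while independent pairs should give a value near zero, because shuffling then changes little.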
