LG | SP | ML | Jan 6, 2021

Representation learning for maximization of MI, nonlinear ICA and nonlinear subspaces with robust density ratio estimation

arXiv:2101.02083v2 | 6 citations
AI Analysis

This work provides a theoretical foundation for understanding contrastive learning and offers robust methods for nonlinear ICA and subspace estimation, which could benefit researchers in unsupervised representation learning.

This paper theoretically connects contrastive learning to mutual information (MI) maximization, showing that density ratio estimation is necessary and sufficient for MI maximization under certain conditions. It then establishes new recovery conditions for nonlinear independent component analysis (ICA) using density ratios, including a novel insight into data dimensionality, and proposes a robust framework for nonlinear subspace estimation.
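
For readers who want the link between MI and the density ratio spelled out, the standard identity can be written as below. The symbols $r$, $h$, and $q$ are notation chosen here for illustration and are not necessarily the paper's own; the second line assumes a balanced binary classifier that separates paired samples from independently re-paired ones.

```latex
% MI as the expected log density ratio between the joint and the product of marginals
I(X;Y) = \mathbb{E}_{p(x,y)}\!\left[\log r(x,y)\right],
\qquad
r(x,y) = \frac{p(x,y)}{p(x)\,p(y)}

% With equal class priors, the Bayes-optimal classifier's logit h equals the log ratio,
% where q(\text{joint} \mid x,y) is the posterior probability that (x,y) is a true pair
h(x,y) = \log\frac{q(\text{joint}\mid x,y)}{1 - q(\text{joint}\mid x,y)} = \log r(x,y)
```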

Contrastive learning is a recent and promising approach to unsupervised representation learning in which a feature representation is learned by solving a pseudo-classification problem constructed from unlabelled data. However, it is not straightforward to understand what representation contrastive learning yields. In addition, contrastive learning is often based on maximum likelihood estimation, which tends to be vulnerable to contamination by outliers. To promote understanding of contrastive learning, this paper first theoretically shows a connection to maximization of mutual information (MI). Our result indicates that density ratio estimation is necessary and sufficient for maximization of MI under some conditions. Thus, contrastive learning related to density ratio estimation, as done in popular objective functions, can be interpreted as maximizing MI. Next, using the density ratio, we establish new recovery conditions for the latent source components in nonlinear independent component analysis (ICA). In contrast with existing work, the established conditions include a novel insight into the dimensionality of data, which is clearly supported by numerical experiments. Furthermore, inspired by nonlinear ICA, we propose a novel framework for estimating a nonlinear subspace for lower-dimensional latent source components, and we establish theoretical conditions for the subspace estimation in terms of the density ratio. We then propose a practical method based on outlier-robust density ratio estimation, which can be seen as performing maximization of MI, nonlinear ICA, or nonlinear subspace estimation. Moreover, a sample-efficient nonlinear ICA method is also proposed. We theoretically investigate the outlier-robustness of the proposed methods. Finally, the usefulness of the proposed methods is numerically demonstrated in nonlinear ICA and through an application to linear classification.
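
The pseudo-classification idea can be illustrated on a toy problem. The sketch below is not the paper's method: it uses plain (non-robust) logistic regression, a correlated bivariate Gaussian, and hand-picked quadratic features, all of which are assumptions made here for illustration. The function names `features`, `fit_logistic`, and `estimate_mi` are hypothetical. A classifier is trained to tell true pairs (x, y) from shuffled pairs, its logit serves as an estimate of the log density ratio, and averaging that logit over true pairs gives an MI estimate that can be compared with the closed-form Gaussian value.

```python
import numpy as np

def features(x, y):
    # Quadratic features; enough to represent the log density ratio
    # exactly in this Gaussian toy case (an assumption of the sketch).
    return np.stack([np.ones_like(x), x * y, x**2, y**2], axis=1)

def fit_logistic(F, t, n_iter=50, ridge=1e-6):
    # Ordinary logistic regression fitted by Newton's method (IRLS).
    # This is the standard maximum-likelihood fit, not an outlier-robust one.
    w = np.zeros(F.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(F @ w)))
        grad = F.T @ (p - t) / len(t) + ridge * w
        hess = (F * (p * (1.0 - p))[:, None]).T @ F / len(t)
        hess += ridge * np.eye(F.shape[1])
        w -= np.linalg.solve(hess, grad)
    return w

def estimate_mi(rho=0.8, n=20000, seed=0):
    rng = np.random.default_rng(seed)
    # True pairs: samples from the joint distribution p(x, y).
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1.0 - rho**2) * rng.standard_normal(n)
    # Shuffling y breaks the pairing, approximating p(x) p(y).
    y_shuffled = rng.permutation(y)
    F = np.concatenate([features(x, y), features(x, y_shuffled)])
    t = np.concatenate([np.ones(n), np.zeros(n)])
    w = fit_logistic(F, t)
    # With balanced classes, the logit estimates log p(x,y) / (p(x) p(y)).
    log_ratio = features(x, y) @ w
    mi_estimate = log_ratio.mean()
    mi_true = -0.5 * np.log(1.0 - rho**2)  # closed form for a bivariate Gaussian
    return mi_estimate, mi_true

if __name__ == "__main__":
    est, true = estimate_mi()
    print(f"estimated MI: {est:.3f}, closed-form MI: {true:.3f}")
```

Because the fit is by maximum likelihood, a handful of extreme contaminated pairs could distort the estimated ratio, which is the vulnerability that the abstract's outlier-robust density ratio estimation is meant to address.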
