LG CV · Jan 24, 2022

Neural Manifold Clustering and Embedding

Zengyi Li, Yubei Chen, Yann LeCun, Friedrich T. Sommer

arXiv:2201.10000v121.952 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses manifold clustering in machine learning and offers a general-purpose method that improves clustering performance, though it appears incremental because it builds on existing self-supervised learning ideas.

The paper tackles the problem of clustering data points that lie on non-linear manifolds by proposing Neural Manifold Clustering and Embedding (NMCE), which combines constraints implemented through data augmentation with a Maximum Coding Rate Reduction objective. NMCE outperforms autoencoder-based deep subspace clustering and, on natural image datasets, can also outperform algorithms designed specifically for clustering.

Given a union of non-linear manifolds, non-linear subspace clustering, or manifold clustering, aims to cluster data points based on manifold structures and also to learn to parameterize each manifold as a linear subspace in a feature space. Deep neural networks have the potential to achieve this goal under highly non-linear settings given their large capacity and flexibility. We argue that achieving manifold clustering with neural networks requires two essential ingredients: a domain-specific constraint that ensures the identification of the manifolds, and a learning algorithm for embedding each manifold to a linear subspace in the feature space. This work shows that many constraints can be implemented by data augmentation. For subspace feature learning, the Maximum Coding Rate Reduction (MCR²) objective can be used. Putting them together yields Neural Manifold Clustering and Embedding (NMCE), a novel method for general-purpose manifold clustering, which significantly outperforms autoencoder-based deep subspace clustering. Further, on more challenging natural image datasets, NMCE can also outperform other algorithms specifically designed for clustering. Qualitatively, we demonstrate that NMCE learns a meaningful and interpretable feature space. As the formulation of NMCE is closely related to several important self-supervised learning (SSL) methods, we believe this work can help us build a deeper understanding of SSL representation learning.
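The abstract names the Maximum Coding Rate Reduction objective without stating it. As a rough illustration, the sketch below computes the rate reduction in its commonly stated form, ΔR = R(Z) − Σ_j (n_j / n) R(Z_j), where R(Z) = ½ log det(I + d / (n ε²) Z Zᵀ). This is a minimal pure-Python sketch under assumptions: the function names, the value `eps=0.5`, and the toy data are hypothetical, features are assumed to be unit-norm, and the data-augmentation constraint and neural network that NMCE adds on top are not reproduced here.

```python
import math


def logdet_spd(a):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(s)
                total += 2.0 * math.log(low[i][j])
            else:
                low[i][j] = s / low[j][j]
    return total


def coding_rate(rows, weights, eps):
    """R = 1/2 logdet(I + d / (m eps^2) * sum_i w_i z_i z_i^T), with m = sum_i w_i."""
    d = len(rows[0])
    mass = sum(weights)
    scale = d / (mass * eps ** 2)
    a = [[1.0 if p == q else 0.0 for q in range(d)] for p in range(d)]
    for z, w in zip(rows, weights):
        for p in range(d):
            for q in range(d):
                a[p][q] += scale * w * z[p] * z[q]
    return 0.5 * logdet_spd(a)


def rate_reduction(rows, memberships, eps=0.5):
    """Coding rate of all features minus the membership-weighted per-cluster rates."""
    n = len(rows)
    whole = coding_rate(rows, [1.0] * n, eps)
    parts = 0.0
    for j in range(len(memberships[0])):
        w = [m[j] for m in memberships]
        mass = sum(w)
        if mass > 0:  # skip empty clusters
            parts += (mass / n) * coding_rate(rows, w, eps)
    return whole - parts


# Toy data (hypothetical): two orthogonal 1-D subspaces in R^2.
rows = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
good = [[1, 0], [1, 0], [0, 1], [0, 1]]   # one cluster per subspace
mixed = [[1, 0], [0, 1], [1, 0], [0, 1]]  # each cluster spans both subspaces

print(rate_reduction(rows, good) > rate_reduction(rows, mixed))  # prints True
```

On this toy data, the partition that assigns one cluster per subspace scores a higher rate reduction than the partition that mixes them, which is the behavior a clustering objective of this kind relies on: the whole feature set should be expensive to encode while each cluster, taken alone, should be cheap.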
