LG · Mar 14, 2018

Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering

Xiaopeng Li, Zhourong Chen, Leonard K. M. Poon, Nevin L. Zhang

arXiv:1803.05206v3 · 11.758 citations

Originality: Highly original

AI Analysis

This work addresses the need for multi-faceted clustering of high-dimensional data. Rather than incrementally improving existing deep clustering methods, it proposes a model that partitions the same data in several ways at once.

The authors address the limitation that previous deep clustering methods produce only a single partition of the data. They introduce a latent tree variational autoencoder (LTVAE) that learns a superstructure of discrete latent variables, each of which yields its own meaningful partition of the data.

We investigate a variant of variational autoencoders where there is a superstructure of discrete latent variables on top of the latent features. In general, our superstructure is a tree structure of multiple super latent variables, and it is automatically learned from data. When there is only one latent variable in the superstructure, our model reduces to one that assumes the latent features to be generated from a Gaussian mixture model. We call our model the latent tree variational autoencoder (LTVAE). Whereas previous deep learning methods for clustering produce only one partition of data, LTVAE produces multiple partitions of data, each being given by one super latent variable. This is desirable because high-dimensional data usually have many different natural facets and can be meaningfully partitioned in multiple ways.
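To make the idea of a superstructure concrete, here is a minimal sketch of ancestral sampling from a tiny latent tree placed over the latent features. Everything specific in it is an assumption made for illustration: the fixed two-node tree `Y1 -> Y2`, the cardinalities, the probabilities, the Gaussian means, and the function name `sample_latent_tree` are hypothetical and do not come from the paper, where the tree structure is learned from data. The encoder and decoder networks are omitted entirely.

```python
import numpy as np

def sample_latent_tree(n, rng):
    """Sample n latent feature vectors from a hypothetical two-node latent tree."""
    # Hypothetical fixed structure Y1 -> Y2 (LTVAE learns the structure instead).
    p_y1 = np.array([0.5, 0.5])                    # P(Y1), 2 states
    p_y2_given_y1 = np.array([[0.8, 0.1, 0.1],
                              [0.1, 0.1, 0.8]])    # P(Y2 | Y1), 3 states
    # Each super latent variable governs its own group of latent features.
    mu_z1 = np.array([[-2.0, -2.0], [2.0, 2.0]])   # mean of z[:, 0:2] given Y1
    mu_z2 = np.array([[-3.0], [0.0], [3.0]])       # mean of z[:, 2:3] given Y2
    sigma = 0.5                                    # shared std dev (assumption)

    # Ancestral sampling: root first, then the child given its parent.
    y1 = rng.choice(2, size=n, p=p_y1)
    y2 = np.array([rng.choice(3, p=p_y2_given_y1[k]) for k in y1])

    # Latent features are Gaussian given the state of their parent variable.
    z1 = mu_z1[y1] + sigma * rng.standard_normal((n, 2))
    z2 = mu_z2[y2] + sigma * rng.standard_normal((n, 1))
    z = np.concatenate([z1, z2], axis=1)
    return y1, y2, z

rng = np.random.default_rng(0)
y1, y2, z = sample_latent_tree(1000, rng)
# y1 and y2 are two different partitions of the same 1000 points.
print(z.shape, np.bincount(y1), np.bincount(y2))
```

In this sketch `y1` and `y2` each label every point, so they give two partitions of the same data, one per super latent variable. Dropping `Y2` and letting `Y1` govern all latent features would leave a single discrete variable with Gaussian children, which is the Gaussian mixture special case the abstract mentions.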
