LG, AI, ML | Dec 16, 2018

Deep Clustering Based on a Mixture of Autoencoders

arXiv:1812.06535v2 | 39 citations
Originality: Incremental advance
AI Analysis

This paper addresses unsupervised clustering for machine learning researchers and practitioners. It offers a method that avoids the common failure mode in which all data points collapse to a single point, though the advance appears incremental because it builds on existing deep clustering and autoencoder techniques.

The paper tackles unsupervised clustering by proposing DAMIC, a deep clustering algorithm using a mixture of autoencoders, which jointly learns data representations and clusters by minimizing reconstruction loss without needing regularization to prevent collapse. Experimental results on image and text data show significant improvements over state-of-the-art methods.

In this paper we propose a Deep Autoencoder MIxture Clustering (DAMIC) algorithm based on a mixture of deep autoencoders, where each cluster is represented by an autoencoder. A clustering network transforms the data into another space and then selects one of the clusters. Next, the autoencoder associated with this cluster is used to reconstruct the data point. The clustering algorithm jointly learns the nonlinear data representation and the set of autoencoders. The optimal clustering is found by minimizing the reconstruction loss of the mixture-of-autoencoders network. Unlike other deep clustering algorithms, no regularization term is needed to avoid data collapsing to a single point. Our experimental evaluations on image and text corpora show significant improvement over state-of-the-art methods.
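
The architecture described in the abstract can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the class names (`TinyAutoencoder`, `MixtureOfAutoencoders`), the single-hidden-layer sizes, and the specific loss form (a negative log of the assignment-weighted sum of `exp(-reconstruction error)`) are all assumptions chosen for illustration. The abstract states only that a clustering network selects a cluster, that one autoencoder per cluster reconstructs the data point, and that the mixture's reconstruction loss is minimized. Training is omitted because it would require an automatic differentiation library.

```python
import math
import random


def init_layer(n_out, n_in, rng):
    """Small random weights and zero biases for one dense layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(weights, bias, x):
    """Dense layer: W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def tanh_vec(v):
    return [math.tanh(a) for a in v]


def log_softmax(z):
    """Numerically stable log of the softmax probabilities."""
    m = max(z)
    log_norm = m + math.log(sum(math.exp(a - m) for a in z))
    return [a - log_norm for a in z]


class TinyAutoencoder:
    """Hypothetical one-hidden-layer autoencoder standing in for a deep one."""

    def __init__(self, dim, hidden, rng):
        self.enc = init_layer(hidden, dim, rng)
        self.dec = init_layer(dim, hidden, rng)

    def reconstruct(self, x):
        return linear(*self.dec, tanh_vec(linear(*self.enc, x)))


class MixtureOfAutoencoders:
    """One autoencoder per cluster plus a clustering (gating) network."""

    def __init__(self, dim, hidden, n_clusters, seed=0):
        rng = random.Random(seed)
        self.autoencoders = [TinyAutoencoder(dim, hidden, rng) for _ in range(n_clusters)]
        self.gate_hidden = init_layer(hidden, dim, rng)
        self.gate_out = init_layer(n_clusters, hidden, rng)

    def log_assignment_probs(self, x):
        # Clustering network: maps x to another space, then scores each cluster.
        h = tanh_vec(linear(*self.gate_hidden, x))
        return log_softmax(linear(*self.gate_out, h))

    def reconstruction_errors(self, x):
        # Squared error of each cluster's autoencoder on the same input.
        return [
            sum((xi - ri) ** 2 for xi, ri in zip(x, ae.reconstruct(x)))
            for ae in self.autoencoders
        ]

    def loss(self, x):
        # Assumed loss: -log sum_k p(k | x) * exp(-error_k), via log-sum-exp.
        terms = [
            lp - e
            for lp, e in zip(self.log_assignment_probs(x), self.reconstruction_errors(x))
        ]
        m = max(terms)
        return -(m + math.log(sum(math.exp(t - m) for t in terms)))

    def predict(self, x):
        # Hard cluster decision: the most probable cluster under the gate.
        log_p = self.log_assignment_probs(x)
        return max(range(len(log_p)), key=log_p.__getitem__)


model = MixtureOfAutoencoders(dim=4, hidden=3, n_clusters=2)
sample = [0.5, -1.0, 0.25, 2.0]
print("cluster:", model.predict(sample), "loss:", model.loss(sample))
```

In this sketch the loss for a point always lies between the smallest and largest per-cluster reconstruction error, so lowering it requires both better autoencoders and a gate that routes each point to the autoencoder that reconstructs it well.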
