Deep Amortized Clustering
This work addresses clustering efficiency for data scientists by meta-learning from labeled datasets. The contribution is incremental, as it builds on existing neural clustering methods.
The authors tackle efficient dataset clustering by proposing Deep Amortized Clustering (DAC), a neural architecture that learns to cluster in a few forward passes. They show that it accurately clusters new datasets drawn from the same distribution as the training data.
We propose Deep Amortized Clustering (DAC), a neural architecture that learns to cluster datasets efficiently using a few forward passes. DAC implicitly learns what makes a cluster, how to group data points into clusters, and how to count the number of clusters in a dataset. DAC is meta-learned using labelled datasets for training, in contrast to traditional clustering algorithms, which usually require hand-specified prior knowledge about cluster shapes and structures. We empirically show, on both synthetic and image data, that DAC can efficiently and accurately cluster new datasets drawn from the same distribution used to generate the training datasets.
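The abstract does not spell out how a few forward passes yield both the grouping and the cluster count, so the following is only a minimal sketch of one way such a loop could work, not the authors' architecture. It assumes that each pass picks an anchor point, asks a model which remaining points share the anchor's cluster, removes them, and repeats, so that the number of passes equals the number of clusters found. The names `membership_stub`, `amortized_cluster`, and `make_blobs`, and the `radius` parameter, are hypothetical. The distance threshold merely stands in for a meta-learned network.

```python
import math
import random


def membership_stub(points, anchor, radius=1.5):
    """Hypothetical stand-in for a learned model (assumption, not DAC itself).

    One call plays the role of one forward pass: for every remaining point,
    it returns True if the point is judged to share a cluster with the anchor.
    A trained network would replace this fixed distance threshold.
    """
    return [math.dist(p, anchor) <= radius for p in points]


def amortized_cluster(points, membership_fn=membership_stub, max_passes=50):
    """Peel off one cluster per pass until no points remain.

    Returns (labels, n_clusters); n_clusters is the number of passes used.
    """
    remaining = list(range(len(points)))  # indices not yet assigned
    labels = [-1] * len(points)
    n_clusters = 0
    while remaining and n_clusters < max_passes:
        anchor = points[remaining[0]]  # assumption: first remaining point is the anchor
        mask = membership_fn([points[i] for i in remaining], anchor)
        for i, is_member in zip(remaining, mask):
            if is_member:
                labels[i] = n_clusters
        # Keep only the points the model did not assign in this pass.
        remaining = [i for i, is_member in zip(remaining, mask) if not is_member]
        n_clusters += 1
    return labels, n_clusters


def make_blobs(centers, n_per=20, spread=0.2, seed=0):
    """Toy 2-D data: n_per points scattered uniformly around each center."""
    rng = random.Random(seed)
    pts = []
    for cx, cy in centers:
        for _ in range(n_per):
            pts.append((cx + rng.uniform(-spread, spread),
                        cy + rng.uniform(-spread, spread)))
    return pts


points = make_blobs([(0, 0), (5, 5), (-5, 5)])
labels, n_clusters = amortized_cluster(points)
print(n_clusters, labels[0], labels[20], labels[40])
```

In this sketch the cost scales with the number of clusters rather than with an iterative optimization over all points, which is the sense in which the clustering is amortized here. The quality of the result depends entirely on the membership function, which in a meta-learned setting would be trained on labelled datasets.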