Deep Goal-Oriented Clustering
This work addresses the challenge of integrating clustering and prediction for machine learning practitioners, though its contribution is incremental, as it builds on existing relationships between these two tasks.
The paper tackles the problem of leveraging supervision to improve clustering by introducing Deep Goal-Oriented Clustering (DGC), a probabilistic framework that jointly uses side-information and unsupervised modeling. DGC achieves prediction accuracies comparable to the state of the art while learning congruent clustering strategies.
Clustering and prediction are two primary tasks in the fields of unsupervised and supervised learning, respectively. Although many recent advances in machine learning have centered on these two tasks, the interdependent, mutually beneficial relationship between them is rarely explored. One could reasonably expect that appropriately clustering the data would aid the downstream prediction task and, conversely, that better prediction performance on the downstream task could inform a more appropriate clustering strategy. In this work, we focus on the latter part of this mutually beneficial relationship. To this end, we introduce Deep Goal-Oriented Clustering (DGC), a probabilistic framework that clusters the data by jointly using supervision via side-information and unsupervised modeling of the inherent data structure in an end-to-end fashion. We show the effectiveness of our model on a range of datasets by achieving prediction accuracies comparable to the state of the art while, more importantly in our setting, simultaneously learning congruent clustering strategies.