Clustering-Oriented Representation Learning with Attractive-Repulsive Loss
This work addresses the need for more interpretable and clusterable representations in neural networks, though it is incremental as it builds on existing loss frameworks.
The authors tackled the problem of building useful latent representations by proposing clustering-oriented representation learning (COREL) as an alternative to categorical cross-entropy, resulting in variants that outperform or match CCE in classification tasks like image and news article classification, with Gaussian-COREL achieving better accuracy.
The standard loss function used to train neural network classifiers, categorical cross-entropy (CCE), seeks to maximize accuracy on the training data; building useful representations is not a necessary byproduct of this objective. In this work, we propose clustering-oriented representation learning (COREL) as an alternative to CCE in the context of a generalized attractive-repulsive loss framework. COREL has the consequence of building latent representations that collectively exhibit the quality of natural clustering within the latent space of the final hidden layer, according to a predefined similarity function. Despite being simple to implement, COREL variants outperform or perform equivalently to CCE in a variety of scenarios, including image and news article classification using both feed-forward and convolutional neural networks. Analysis of the latent spaces created with different similarity functions facilitates insights on the different use cases COREL variants can satisfy, where the Cosine-COREL variant makes a consistently clusterable latent space, while Gaussian-COREL consistently obtains better classification accuracy than CCE.