Learning Embeddings for Image Clustering: An Empirical Study of Triplet Loss Approaches
This work addresses image clustering under noisy labels, but the contribution is incremental because it builds directly on existing triplet loss methods.
The paper studies embedding learning for image clustering by evaluating triplet loss approaches, and finds that a new, simple triplet loss formulation outperforms two existing formulations on CIFAR-10.
In this work, we evaluate two image clustering objectives, k-means clustering and correlation clustering, in the context of feature space embeddings induced by the triplet loss. Specifically, we train a convolutional neural network to learn discriminative features by optimizing two popular versions of the triplet loss, and we study the clustering properties of the resulting embeddings under the assumption of noisy labels. Additionally, we propose a new, simple triplet loss formulation, which shows desirable properties with respect to formal clustering objectives and outperforms the existing methods. We evaluate all three triplet loss formulations for k-means and correlation clustering on the CIFAR-10 image classification dataset.
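For readers unfamiliar with the objective, the following is a minimal sketch of the commonly used margin-based triplet loss on squared Euclidean distances. It is an illustration under stated assumptions, not the formulation proposed here: the abstract does not say which two existing versions are evaluated or how the new one is defined. The function names, the margin value, and the toy embeddings are all hypothetical.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_margin_loss(anchors, positives, negatives, margin=1.0):
    """Mean hinge triplet loss over a batch of (anchor, positive, negative) triplets.

    Assumption: this is the standard margin formulation, used only for
    illustration; it is not claimed to match any variant in the paper.
    """
    losses = []
    for a, p, n in zip(anchors, positives, negatives):
        d_pos = squared_distance(a, p)  # anchor to same-class sample
        d_neg = squared_distance(a, n)  # anchor to other-class sample
        # Penalize triplets where the negative is not at least `margin`
        # farther from the anchor than the positive.
        losses.append(max(0.0, d_pos - d_neg + margin))
    return sum(losses) / len(losses)


# Toy 2-D embeddings (hypothetical values).
anchors = [[0.0, 0.0], [1.0, 1.0]]
positives = [[0.0, 1.0], [1.0, 1.0]]
negatives = [[3.0, 0.0], [1.0, 1.5]]

loss = triplet_margin_loss(anchors, positives, negatives, margin=1.0)
print(loss)
```

In the first triplet the negative is already well separated from the anchor, so it contributes zero loss. In the second the negative lies within the margin, so it contributes a positive penalty. Embeddings trained this way tend to place same-class images closer together than different-class images, which is the property a downstream clustering objective such as k-means relies on.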