The Triad of Failure Modes and a Possible Way Out
This work addresses a specific problem in self-supervised learning, though it appears incremental because it builds on existing cluster-based methods.
The authors tackle failure modes in cluster-based self-supervised learning by proposing a novel objective function that addresses representation collapse, cluster collapse, and invariance to permutations of cluster assignments; they report effective performance on toy and real-world data.
We present a novel objective function for cluster-based self-supervised learning (SSL) that is designed to circumvent the triad of failure modes, namely representation collapse, cluster collapse, and the problem of invariance to permutations of cluster assignments. This objective consists of three key components: (i) a generative term that penalizes representation collapse, (ii) a term that promotes invariance to data augmentations, thereby addressing the issue of label permutations, and (iii) a uniformity term that penalizes cluster collapse. Additionally, our proposed objective possesses two notable advantages. First, it can be interpreted from a Bayesian perspective as a lower bound on the data log-likelihood. Second, it enables the training of a standard backbone architecture without the need for asymmetric elements like stop gradients, momentum encoders, or specialized clustering layers. Due to its simplicity and theoretical foundation, our proposed objective is well-suited for optimization. Experiments on both toy and real-world data demonstrate its effectiveness.
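The three-term structure described above can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not give the exact form of any term, so the cross-entropy used for the invariance term, the KL divergence to a uniform distribution used for the uniformity term, the function names, and the weights `w_inv` and `w_unif` are all assumptions made for illustration. The generative term is left as a scalar supplied by the caller, since its definition is not stated here.

```python
import math


def invariance_term(p_view1, p_view2, eps=1e-12):
    """Assumed form: mean cross-entropy between the cluster assignments
    predicted for two augmented views of the same samples."""
    n = len(p_view1)
    total = 0.0
    for p1, p2 in zip(p_view1, p_view2):
        total -= sum(q * math.log(p + eps) for p, q in zip(p1, p2))
    return total / n


def uniformity_term(p_batch, eps=1e-12):
    """Assumed form: KL(mean assignment || uniform). It is close to zero
    when clusters are used equally and grows as assignments concentrate
    on a few clusters (cluster collapse)."""
    n = len(p_batch)
    k = len(p_batch[0])
    mean = [sum(p[j] for p in p_batch) / n for j in range(k)]
    return sum(m * math.log((m + eps) * k) for m in mean)


def total_objective(generative_nll, p_view1, p_view2, w_inv=1.0, w_unif=1.0):
    """Hypothetical weighted sum of the three components.
    generative_nll is a placeholder scalar for the generative term."""
    return (
        generative_nll
        + w_inv * invariance_term(p_view1, p_view2)
        + w_unif * uniformity_term(p_view1)
    )


# Toy batches of cluster-assignment probabilities over 3 clusters.
balanced = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
collapsed = [[1.0, 0.0, 0.0]] * 3

print(uniformity_term(balanced))             # near zero
print(uniformity_term(collapsed))            # near log(3)
print(invariance_term(balanced, balanced))   # near zero
```

In this sketch, a batch whose samples all fall into one cluster is penalized by the uniformity term, while two views that disagree on their assignments are penalized by the invariance term.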