CV | LG | Oct 18, 2024

Pseudo-label Refinement for Improving Self-Supervised Learning Systems

arXiv:2410.14242v1 | 5 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses performance degradation in self-supervised learning systems for computer vision tasks such as person re-identification. The contribution appears incremental, since it refines existing pseudo-labeling approaches.

The paper tackles noise in pseudo-labels from clustering-based self-supervised learning by proposing a pseudo-label refinement algorithm that projects and combines labels across epochs, then uses hierarchical clustering to generate refined hard-labels. It demonstrates improved mean Average Precision in unsupervised domain adaptation for person re-identification across various scenarios.

Self-supervised learning systems have gained significant attention in recent years by leveraging clustering-based pseudo-labels to provide supervision without the need for human annotations. However, the noise in these pseudo-labels caused by the clustering methods poses a challenge to the learning process, leading to degraded performance. In this work, we propose a pseudo-label refinement (SLR) algorithm to address this issue. The cluster labels from the previous epoch are projected to the current epoch's cluster-label space, and a linear combination of the new label and the projected label is computed as a soft refined label that contains information from both the previous and the current epoch's clusters. In contrast to the common practice of using the maximum value as a cluster/class indicator, we employ hierarchical clustering on these soft pseudo-labels to generate refined hard labels. This approach better utilizes the information embedded in the soft labels, outperforming the simple maximum value approach for hard label generation. The effectiveness of the proposed SLR algorithm is evaluated in the context of person re-identification (Re-ID) using unsupervised domain adaptation (UDA). Experimental results demonstrate that the modified Re-ID baseline, incorporating the SLR algorithm, achieves significantly improved mean Average Precision (mAP) performance in various UDA tasks, including real-to-synthetic, synthetic-to-real, and different real-to-real scenarios. These findings highlight the efficacy of the SLR algorithm in enhancing the performance of self-supervised learning systems.
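
The refinement steps described in the abstract can be illustrated with a small, dependency-free Python sketch. This is not the authors' implementation. The abstract does not say how labels are projected between epochs, so the sketch assumes a row-normalised co-occurrence matrix between previous and current clusters. The mixing weight `alpha`, the average-linkage criterion, the requirement that labels are integers in `0..k-1`, and all function names are likewise assumptions made for illustration.

```python
import math


def one_hot(labels, k):
    """Encode integer cluster labels as one-hot rows of length k."""
    return [[1.0 if c == j else 0.0 for j in range(k)] for c in labels]


def projection_matrix(prev, cur, k_prev, k_cur):
    """Assumed projection: row p holds the fraction of samples from previous
    cluster p that fall into each current cluster."""
    counts = [[0.0] * k_cur for _ in range(k_prev)]
    for p, c in zip(prev, cur):
        counts[p][c] += 1.0
    for row in counts:
        total = sum(row)
        if total > 0:
            for j in range(k_cur):
                row[j] /= total
    return counts


def soft_refined_labels(prev, cur, k_prev, k_cur, alpha=0.5):
    """Linear combination of the current one-hot label and the previous
    label projected into the current cluster-label space."""
    proj = projection_matrix(prev, cur, k_prev, k_cur)
    cur_oh = one_hot(cur, k_cur)
    return [
        [alpha * cur_oh[i][j] + (1.0 - alpha) * proj[prev[i]][j]
         for j in range(k_cur)]
        for i in range(len(cur))
    ]


def argmax_labels(soft):
    """Baseline: take the largest soft-label entry as the hard label."""
    return [max(range(len(row)), key=row.__getitem__) for row in soft]


def hierarchical_labels(soft, n_clusters):
    """Naive average-linkage agglomerative clustering on the soft labels
    (assumed linkage; fine for toy inputs, far too slow for real data)."""
    clusters = [[i] for i in range(len(soft))]

    def avg_dist(a, b):
        total = sum(math.dist(soft[i], soft[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (avg_dist(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(soft)
    for idx, members in enumerate(clusters):
        for m in members:
            labels[m] = idx
    return labels
```

On a toy input with previous labels `[0, 0, 0, 1, 1, 1]`, current labels `[0, 0, 1, 1, 1, 1]`, and `alpha=0.3`, the third sample's soft label is roughly `(0.47, 0.53)`. The argmax baseline keeps it in cluster 1, while clustering the soft labels into two groups places it with the first two samples. That is the kind of difference between the two hard-label strategies that the abstract refers to, here only under the assumptions stated above.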

