CV · Jan 24, 2024

Memory Consistency Guided Divide-and-Conquer Learning for Generalized Category Discovery

Yuanpeng Tu, Zhun Zhong, Yuxi Li, Hengshuang Zhao

arXiv:2401.13325v25.22 citations

Originality: Incremental advance

AI Analysis

This work addresses a challenging semi-supervised learning setting for image recognition, offering a novel method to better utilize unlabeled data, though it is incremental as it builds on existing contrastive and clustering approaches.

The paper tackles generalized category discovery (GCD), a semi-supervised learning setting in which only some training samples have labels. It proposes a memory consistency guided divide-and-conquer learning framework that uses the model's historical predictions to improve accuracy, reporting gains of +8.4% on CUB and +8.1% on Stanford Cars.

Generalized category discovery (GCD) aims at addressing a more realistic and challenging setting of semi-supervised learning, where only part of the category labels are assigned to certain training samples. Previous methods generally employ naive contrastive learning or unsupervised clustering schemes for all the samples. Nevertheless, they usually ignore the inherent critical information within the historical predictions of the model being trained. Specifically, we empirically reveal that a significant number of salient unlabeled samples yield consistent historical predictions corresponding to their ground truth category. From this observation, we propose a Memory Consistency guided Divide-and-conquer Learning framework (MCDL). In this framework, we introduce two memory banks to record historical predictions of unlabeled data, which are exploited to measure the credibility of each sample in terms of its prediction consistency. With the guidance of credibility, we can design a divide-and-conquer learning strategy to fully utilize the discriminative information of unlabeled data while alleviating the negative influence of noisy labels. Extensive experimental results on multiple benchmarks demonstrate the generality and superiority of our method, where our method outperforms state-of-the-art models by a large margin on both seen and unseen classes of the generic image recognition and challenging semantic shift settings (i.e., with a +8.4% gain on CUB and +8.1% on Stanford Cars).
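The idea of scoring unlabeled samples by the consistency of their historical predictions can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract mentions two memory banks but does not specify their contents, so the example uses a single hypothetical bank. The class name `PredictionMemory`, the window size, the majority-agreement score, and the threshold are all assumptions made for illustration.

```python
from collections import Counter, deque


class PredictionMemory:
    """Hypothetical memory bank: keeps the last `window` predicted labels per sample."""

    def __init__(self, window=5):
        self.window = window
        self.history = {}  # sample id -> deque of recent predicted labels

    def update(self, sample_ids, predictions):
        """Record one predicted label for each given sample id."""
        for sid, pred in zip(sample_ids, predictions):
            self.history.setdefault(sid, deque(maxlen=self.window)).append(pred)

    def credibility(self, sid):
        """Fraction of stored predictions that agree with the most frequent one."""
        preds = self.history.get(sid)
        if not preds:
            return 0.0
        _, count = Counter(preds).most_common(1)[0]
        return count / len(preds)

    def divide(self, threshold=0.8):
        """Split sample ids into (credible, uncertain) lists by credibility."""
        credible, uncertain = [], []
        for sid in self.history:
            target = credible if self.credibility(sid) >= threshold else uncertain
            target.append(sid)
        return credible, uncertain


memory = PredictionMemory(window=4)
# Made-up predictions for three unlabeled samples over four epochs.
for epoch_preds in ([2, 0, 1], [2, 1, 1], [2, 3, 1], [2, 0, 0]):
    memory.update(["a", "b", "c"], epoch_preds)

credible, uncertain = memory.divide(threshold=0.75)
print(credible, uncertain)
```

In a divide-and-conquer scheme of this kind, the credible group might be trained with its majority prediction as a pseudo-label while the uncertain group is handled with a more noise-tolerant objective; how MCDL actually treats each group is not stated in the abstract.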

