LG, May 4, 2023

Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning

arXiv:2305.02795v2, 41 citations
Originality: Incremental advance
AI Analysis

This work addresses challenges in semi-supervised multi-label learning for applications like image or text classification, representing an incremental improvement over existing pseudo-labeling methods.

The paper tackles pseudo-labeling in semi-supervised multi-label learning, where conventional methods struggle because each instance can carry multiple labels and the number of labels per instance is unknown, which leads to false positives or missed true positives. The proposed Class-Aware Pseudo-Labeling (CAP) method sets class-distribution-aware thresholds so that the pseudo-label distribution matches the class distribution estimated from the labeled data. The authors report that experiments on multiple benchmark datasets confirm its efficacy.

Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data. However, in the context of semi-supervised multi-label learning (SSMLL), conventional pseudo-labeling methods encounter difficulties when dealing with instances associated with multiple labels and an unknown label count. These limitations often result in the introduction of false positive labels or the neglect of true positive ones. To overcome these challenges, this paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner. The proposed approach introduces a regularized learning framework incorporating class-aware thresholds, which effectively control the assignment of positive and negative pseudo-labels for each class. Notably, even with a small proportion of labeled examples, our observations demonstrate that the estimated class distribution serves as a reliable approximation. Motivated by this finding, we develop a class-distribution-aware thresholding strategy to ensure the alignment of pseudo-label distribution with the true distribution. The correctness of the estimated class distribution is theoretically verified, and a generalization error bound is provided for our proposed method. Extensive experiments on multiple benchmark datasets confirm the efficacy of CAP in addressing the challenges of SSMLL problems.
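The class-distribution-aware thresholding idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single threshold per class (the abstract mentions separate control of positive and negative pseudo-labels), uses a plain quantile rule, and all function names are hypothetical. It estimates each class's positive proportion from the labeled set, then picks a per-class threshold so that roughly the same proportion of unlabeled predictions is marked positive.

```python
import numpy as np


def estimate_class_proportions(labeled_targets):
    """Fraction of labeled examples that are positive for each class.

    labeled_targets: (n_labeled, n_classes) array of 0/1 labels.
    """
    return np.asarray(labeled_targets, dtype=float).mean(axis=0)


def class_aware_thresholds(unlabeled_probs, class_proportions):
    """Per-class thresholds chosen by quantile (an illustrative assumption).

    For class k, the threshold is the (1 - p_k) quantile of the predicted
    probabilities on unlabeled data, so about a fraction p_k of the
    unlabeled examples score at or above it.
    """
    unlabeled_probs = np.asarray(unlabeled_probs, dtype=float)
    n_classes = unlabeled_probs.shape[1]
    thresholds = np.empty(n_classes)
    for k in range(n_classes):
        thresholds[k] = np.quantile(unlabeled_probs[:, k],
                                    1.0 - class_proportions[k])
    return thresholds


def assign_pseudo_labels(unlabeled_probs, thresholds):
    """Mark a class positive wherever its score reaches the class threshold."""
    return (np.asarray(unlabeled_probs) >= thresholds).astype(int)


# Toy data: 4 labeled and 4 unlabeled examples, 2 classes.
labeled_targets = np.array([[1, 0], [0, 0], [1, 1], [0, 0]])
unlabeled_probs = np.array([[0.9, 0.3], [0.1, 0.7], [0.8, 0.2], [0.2, 0.1]])

proportions = estimate_class_proportions(labeled_targets)
thresholds = class_aware_thresholds(unlabeled_probs, proportions)
pseudo_labels = assign_pseudo_labels(unlabeled_probs, thresholds)
```

In this toy example the per-class share of positive pseudo-labels equals the proportions estimated from the labeled set, which is the alignment property the abstract describes. A fixed global threshold such as 0.5 would not guarantee this for rare classes.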

Code Implementations: 1 repo