LG · Apr 14, 2024

Incremental Self-training for Semi-supervised Learning

arXiv:2404.12398v1 · 5 citations · h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses efficiency issues in semi-supervised learning for researchers and practitioners, but it is incremental as it builds on existing self-training methods.

The paper tackles inefficient data utilization and high time consumption in self-training for semi-supervised learning. It proposes Incremental Self-training (IST), which processes data in batches and prioritizes high-certainty pseudo-labels. The authors report improved recognition accuracy and learning speed on five datasets, and results that outperform state-of-the-art methods on three image classification tasks.

Semi-supervised learning provides a solution to reduce the dependency of machine learning on labeled data. As one of the efficient semi-supervised techniques, self-training (ST) has received increasing attention. Several advancements have emerged to address challenges associated with noisy pseudo-labels. Previous works on self-training acknowledge the importance of unlabeled data but have not delved into its efficient utilization, nor have they paid attention to the problem of high time consumption caused by iterative learning. This paper proposes Incremental Self-training (IST) for semi-supervised learning to fill these gaps. Unlike ST, which processes all data indiscriminately, IST processes data in batches and gives priority to assigning pseudo-labels to unlabeled samples with high certainty. Then, it processes the data around the decision boundary after the model is stabilized, enhancing classifier performance. Our IST is simple yet effective and fits existing self-training-based semi-supervised learning methods. We verify the proposed IST on five datasets and two types of backbones, effectively improving the recognition accuracy and learning speed. Significantly, it outperforms state-of-the-art competitors on three challenging image classification tasks.
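
To make the batch-wise idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The nearest-centroid classifier, the softmax-over-distance confidence score, the function names, the `batch_size` and `threshold` values, and the choice to treat "all batches seen" as the point where the model has stabilized are all assumptions made for illustration.

```python
import math


def fit_centroids(points, labels):
    """Toy nearest-centroid classifier: one mean vector per class."""
    sums, counts = {}, {}
    for (x, y), c in zip(points, labels):
        sx, sy = sums.get(c, (0.0, 0.0))
        sums[c] = (sx + x, sy + y)
        counts[c] = counts.get(c, 0) + 1
    return {c: (sx / counts[c], sy / counts[c]) for c, (sx, sy) in sums.items()}


def predict_with_confidence(centroids, point):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    weights = {c: math.exp(-math.dist(point, m)) for c, m in centroids.items()}
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())


def incremental_self_training(labeled, labels, unlabeled, batch_size=20, threshold=0.8):
    """Hypothetical two-phase loop: confident samples first, boundary samples last."""
    train_x, train_y = list(labeled), list(labels)
    deferred = []  # low-certainty samples, assumed to lie near the decision boundary

    # Phase 1: visit the unlabeled pool batch by batch, refit once per batch,
    # and accept only pseudo-labels whose confidence reaches the threshold.
    for start in range(0, len(unlabeled), batch_size):
        centroids = fit_centroids(train_x, train_y)
        for point in unlabeled[start:start + batch_size]:
            label, confidence = predict_with_confidence(centroids, point)
            if confidence >= threshold:
                train_x.append(point)
                train_y.append(label)
            else:
                deferred.append(point)

    # Phase 2: once every batch has been seen (our stand-in for "stabilized"),
    # label the deferred boundary samples with the refitted model.
    centroids = fit_centroids(train_x, train_y)
    for point in deferred:
        label, _ = predict_with_confidence(centroids, point)
        train_x.append(point)
        train_y.append(label)

    return fit_centroids(train_x, train_y), train_x, train_y, deferred
```

Calling `incremental_self_training(labeled, labels, unlabeled)` with 2-D points given as tuples returns the final centroids, the augmented training set, and the samples that were deferred to the second phase. Deferring uncertain samples keeps early, possibly wrong pseudo-labels from pulling the class centroids toward the boundary.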

