CV · Apr 6, 2024

Rethinking Self-training for Semi-supervised Landmark Detection: A Selection-free Approach

arXiv:2404.04556v2 · 6.56 citations · h-index: 11 · Has Code · IEEE Transactions on Image Processing

Originality: Incremental advance

AI Analysis

This work targets landmark detection accuracy in semi-supervised learning, which is relevant to researchers and practitioners in computer vision and medical imaging. The advance is incremental, since it builds on existing self-training paradigms.

The paper applies self-training to semi-supervised landmark detection, addressing data bias and threshold sensitivity with a selection-free method, STLD, that is built on a task curriculum. STLD consistently outperforms existing methods on facial and medical landmark detection benchmarks in semi- and omni-supervised settings.

Self-training is a simple yet effective method for semi-supervised learning, in which pseudo-label selection plays an important role in handling confirmation bias. Despite its popularity, applying self-training to landmark detection faces three problems: 1) the selected confident pseudo-labels often contain data bias, which may hurt model performance; 2) it is not easy to choose a proper threshold for sample selection, as the localization task can be sensitive to noisy pseudo-labels; 3) coordinate regression does not output confidence, making selection-based self-training infeasible. To address these issues, we propose Self-Training for Landmark Detection (STLD), a method that does not require explicit pseudo-label selection. Instead, STLD constructs a task curriculum to deal with confirmation bias, which progressively transitions from more confident to less confident tasks over the rounds of self-training. Pseudo pretraining and shrink regression are two essential components of this curriculum: the former is the first task of the curriculum and provides a better model initialization, while the latter is added in later rounds to directly leverage the pseudo-labels in a coarse-to-fine manner. Experiments on three facial and one medical landmark detection benchmark show that STLD consistently outperforms existing methods in both semi- and omni-supervised settings. The code is available at https://github.com/jhb86253817/STLD.
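
The task curriculum described in the abstract can be pictured as a per-round schedule. The Python sketch below is illustrative only and is not taken from the STLD repository: the names `RoundPlan`, `build_curriculum`, and `coarsen`, the labeled-only first round, the halving grid size, and the reading of "coarse-to-fine" as snapping pseudo-label coordinates to a grid that gets finer each round are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RoundPlan:
    round_idx: int
    tasks: List[str]   # tasks run in this round, in order
    grid_size: float   # hypothetical coarseness of shrink-regression targets (0 = unused)


def build_curriculum(num_rounds: int, start_grid: float = 8.0) -> List[RoundPlan]:
    """Hypothetical schedule: no pseudo-label is ever filtered out; only the tasks change."""
    plans = []
    for r in range(num_rounds):
        if r == 0:
            # No pseudo-labels exist yet, so the first model sees labeled data only.
            plans.append(RoundPlan(r, ["supervised"], 0.0))
        elif r == 1:
            # Pseudo pretraining comes first, followed by training on labeled data.
            plans.append(RoundPlan(r, ["pseudo_pretraining", "supervised"], 0.0))
        else:
            # Later rounds add shrink regression; the grid halves each round (assumed).
            grid = start_grid / 2 ** (r - 2)
            plans.append(
                RoundPlan(r, ["pseudo_pretraining", "supervised+shrink_regression"], grid)
            )
    return plans


def coarsen(coord: float, grid_size: float) -> float:
    """Snap a pseudo-label coordinate to the center of its grid cell (assumed target form)."""
    if grid_size <= 0:
        return coord
    return (coord // grid_size) * grid_size + grid_size / 2


if __name__ == "__main__":
    for plan in build_curriculum(4):
        print(plan.round_idx, plan.tasks, plan.grid_size, coarsen(13.0, plan.grid_size))
```

In this sketch every pseudo-label is used in every round and no confidence threshold appears anywhere, which is the selection-free property the abstract emphasizes. The actual tasks, losses, and schedule are defined in the paper and the linked repository.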
