Semi-Supervised Regression with Heteroscedastic Pseudo-Labels
This work is aimed at researchers and practitioners working on semi-supervised regression. It offers an incremental improvement over existing pseudo-labeling methods by accounting for heteroscedastic noise.
The paper applies pseudo-labeling to semi-supervised regression, where heteroscedastic noise makes pseudo-label reliability hard to assess. It proposes an uncertainty-aware framework that dynamically adjusts pseudo-label influence to mitigate error accumulation and overfitting, and its experiments show better robustness and performance than existing methods.
Pseudo-labeling is a commonly used paradigm in semi-supervised learning, yet its application to semi-supervised regression (SSR) remains relatively under-explored. Unlike classification, where pseudo-labels are discrete and confidence-based filtering is effective, SSR involves continuous outputs with heteroscedastic noise, making it challenging to assess pseudo-label reliability. As a result, naive pseudo-labeling can lead to error accumulation and overfitting to incorrect labels. To address this, we propose an uncertainty-aware pseudo-labeling framework that dynamically adjusts pseudo-label influence from a bi-level optimization perspective. By jointly minimizing empirical risk over all data and optimizing uncertainty estimates to enhance generalization on labeled data, our method effectively mitigates the impact of unreliable pseudo-labels. We provide theoretical insights and extensive experiments to validate our approach across various benchmark SSR datasets, and the results demonstrate superior robustness and performance compared to existing methods. Our code is available at https://github.com/sxq/Heteroscedastic-Pseudo-Labels.
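To make the idea of adjusting pseudo-label influence by uncertainty concrete, the following is a minimal sketch and not the authors' implementation. It assumes a standard heteroscedastic Gaussian negative log-likelihood, in which each pseudo-labeled sample carries a predicted log-variance that scales its squared error. The function name `heteroscedastic_pseudo_label_loss` and the toy numbers are hypothetical, and the bi-level optimization of the uncertainty estimates described in the abstract is not reproduced here.

```python
import math

def heteroscedastic_pseudo_label_loss(pred, pseudo_label, log_var):
    """Gaussian negative log-likelihood for one sample, up to a constant.

    Assumption (not from the paper): the uncertainty is a predicted
    log-variance, so the squared error is scaled by exp(-log_var).
    """
    weight = math.exp(-log_var)  # precision, 1 / sigma^2
    loss = 0.5 * weight * (pred - pseudo_label) ** 2 + 0.5 * log_var
    return loss, weight

# Two pseudo-labeled samples with the same residual but different uncertainty.
samples = [
    (1.0, 2.0, 0.0),  # low uncertainty: log-variance 0
    (1.0, 2.0, 2.0),  # high uncertainty: log-variance 2
]
results = [heteroscedastic_pseudo_label_loss(p, y, s) for p, y, s in samples]
losses = [r[0] for r in results]
weights = [r[1] for r in results]

for (p, y, s), w in zip(samples, weights):
    print(f"log_var={s:.1f} -> weight on squared error = {w:.4f}")
```

In this sketch the sample with the larger log-variance has a smaller weight on its squared error, so an unreliable pseudo-label pulls the model less. The `0.5 * log_var` term penalizes large variances, which keeps the model from declaring every pseudo-label uncertain.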