FlexSSL: A Generic and Efficient Framework for Semi-Supervised Learning
This work addresses the challenge of leveraging unlabeled data more effectively in semi-supervised learning, though it appears incremental because it builds on existing self-training paradigms.
The authors tackle the over-reliance of semi-supervised learning algorithms on limited labeled data by proposing FlexSSL, a framework that jointly solves the main task and an auxiliary task of discriminating label observability. In their evaluations, the framework consistently enhanced performance across diverse tasks.
Semi-supervised learning holds great promise for many real-world applications, due to its ability to leverage both unlabeled and expensive labeled data. However, most semi-supervised learning algorithms still rely heavily on the limited labeled data to infer and utilize the hidden information in unlabeled data. We note that any semi-supervised learning task under the self-training paradigm also hides an auxiliary task of discriminating label observability. Jointly solving these two tasks allows full utilization of information from both labeled and unlabeled data, thus alleviating the problem of over-reliance on labeled data. This naturally leads to a new generic and efficient learning framework that does not rely on any domain-specific information, which we call FlexSSL. The key idea of FlexSSL is to construct a semi-cooperative "game", which forges cooperation between a main self-interested semi-supervised learning task and a companion task that infers label observability to facilitate main task training. We show, through theoretical derivation, its connection to loss re-weighting on noisy labels. Through evaluations on a diverse range of tasks, we demonstrate that FlexSSL can consistently enhance the performance of semi-supervised learning algorithms.
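
The link to loss re-weighting mentioned above can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `reweighted_loss`, its arguments, and the choice to use a discriminator's "label looks observed" probability directly as the weight on pseudo-labeled samples are all assumptions made for illustration.

```python
from typing import Sequence


def reweighted_loss(
    per_sample_loss: Sequence[float],
    is_labeled: Sequence[bool],
    disc_prob: Sequence[float],
) -> float:
    """Weighted mean of per-sample losses (hypothetical illustration).

    Assumptions, not taken from the paper:
    - samples with an observed label get weight 1.0;
    - pseudo-labeled samples get weight disc_prob[i], the probability a
      companion discriminator assigns to "this label looks observed".
    """
    if not (len(per_sample_loss) == len(is_labeled) == len(disc_prob)):
        raise ValueError("all inputs must have the same length")

    # Observed labels are trusted fully; pseudo-labels are down-weighted.
    weights = [1.0 if labeled else p for labeled, p in zip(is_labeled, disc_prob)]

    total_weight = sum(weights)
    if total_weight == 0.0:
        return 0.0  # nothing is trusted, so no loss signal

    weighted_sum = sum(w * loss for w, loss in zip(weights, per_sample_loss))
    return weighted_sum / total_weight


if __name__ == "__main__":
    # One labeled sample and two pseudo-labeled ones with differing trust.
    print(reweighted_loss([1.0, 2.0, 4.0], [True, False, False], [0.0, 0.5, 0.0]))
```

In this sketch, a pseudo-labeled sample that the discriminator considers unreliable contributes little to the main-task loss, which is the general idea behind re-weighting losses under noisy labels.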