MUSCLE: Strengthening Semi-Supervised Learning Via Concurrent Unsupervised Learning Using Mutual Information Maximization
This work provides a method to improve the robustness and performance of semi-supervised learning for deep neural networks, particularly when labeled data is scarce or biased, which is a common challenge for practitioners in various domains.
This paper addresses the degradation of semi-supervised learning (SSL) performance as the amount of labeled data shrinks by introducing MUSCLE, a hybrid approach that concurrently combines unsupervised and semi-supervised learning using mutual information maximization. MUSCLE outperforms state-of-the-art methods on CIFAR-10, CIFAR-100, and Mini-Imagenet, and its performance gains grow as the amount of labeled data decreases and in the presence of bias.
Deep neural networks are powerful, massively parameterized machine learning models that have been shown to perform well in supervised learning tasks. However, very large amounts of labeled data are usually needed to train deep neural networks. Several semi-supervised learning approaches have been proposed to train neural networks using smaller amounts of labeled data together with a large amount of unlabeled data. The performance of these semi-supervised methods degrades significantly as the amount of labeled data decreases. We introduce Mutual-information-based Unsupervised & Semi-supervised Concurrent LEarning (MUSCLE), a hybrid learning approach that uses mutual information to combine both unsupervised and semi-supervised learning. MUSCLE can be used as a stand-alone training scheme for neural networks, and can also be incorporated into other learning approaches. We show that the proposed hybrid model outperforms the state of the art on several standard benchmarks, including CIFAR-10, CIFAR-100, and Mini-Imagenet. Furthermore, the performance gain consistently increases as the amount of labeled data is reduced, as well as in the presence of bias. We also show that MUSCLE has the potential to boost classification performance when used in the fine-tuning phase for a model pre-trained only on unlabeled data.
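The abstract does not state the exact objective, so the following is only an illustrative sketch of how a mutual information term could be combined with a supervised loss, and not the authors' formulation. It assumes the mutual information is estimated between the class predictions for two augmented views of the same unlabeled sample, and that this term is subtracted from a cross-entropy loss on the labeled samples. The function names and the `weight` parameter are hypothetical.

```python
import math

def mutual_information(p_a, p_b, eps=1e-12):
    """Estimate I(A;B) from two batches of class-probability vectors.

    p_a[i] and p_b[i] are the predictions for two views of sample i
    (an assumption of this sketch, not something stated in the abstract).
    """
    n, k = len(p_a), len(p_a[0])
    # Empirical joint distribution over (class for view A, class for view B).
    joint = [[sum(p_a[i][r] * p_b[i][c] for i in range(n)) / n
              for c in range(k)] for r in range(k)]
    # Symmetrize, since the order of the two views is arbitrary.
    joint = [[(joint[r][c] + joint[c][r]) / 2 for c in range(k)]
             for r in range(k)]
    row = [sum(joint[r]) for r in range(k)]
    col = [sum(joint[r][c] for r in range(k)) for c in range(k)]
    mi = 0.0
    for r in range(k):
        for c in range(k):
            p = joint[r][c]
            if p > eps:  # skip empty cells, which contribute zero
                mi += p * math.log(p / (row[r] * col[c]))
    return mi

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-likelihood of the true labels."""
    return -sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels)) / len(labels)

def combined_loss(p_labeled, labels, p_view_a, p_view_b, weight=1.0):
    """Supervised loss minus a weighted mutual information term (hypothetical)."""
    return cross_entropy(p_labeled, labels) - weight * mutual_information(p_view_a, p_view_b)
```

Under these assumptions, minimizing `combined_loss` rewards predictions that are consistent across the two views and spread over the classes, while still fitting the few available labels.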