CV · Jul 23, 2020

Reliable Label Bootstrapping for Semi-Supervised Learning

Paul Albert, Diego Ortego, Eric Arazo, Noel E. O'Connor, Kevin McGuinness

arXiv:2007.11866v23.32 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reducing human annotation effort in machine learning, particularly for computer vision tasks, by improving semi-supervised methods in low-supervision settings. The advance is incremental, since it builds on existing techniques such as self-supervised learning and label propagation.

The paper tackles the problem of training convolutional neural networks with very few labeled samples by proposing Reliable Label Bootstrapping (ReLaB), an unsupervised preprocessing algorithm that improves semi-supervised learning performance, achieving error rates as low as 8.46% on CIFAR-10 with highly representative labeled samples.

Reducing the number of labels required to train convolutional neural networks without performance degradation is key to effectively reducing human annotation effort. We propose Reliable Label Bootstrapping (ReLaB), an unsupervised preprocessing algorithm that improves the performance of semi-supervised algorithms in extremely low supervision settings. Given a dataset with few labeled samples, we first learn meaningful self-supervised latent features for the data. Second, a label propagation algorithm propagates the known labels on the unsupervised features, effectively labeling the full dataset automatically. Third, we select a subset of correctly labeled (reliable) samples using a label noise detection algorithm. Finally, we train a semi-supervised algorithm on the extended subset. We show that the choice of network architecture and of self-supervised algorithm are important factors for successful label propagation, and demonstrate that ReLaB substantially improves semi-supervised learning in scenarios of very limited supervision on CIFAR-10, CIFAR-100 and mini-ImageNet. We reach an average error rate of $\boldsymbol{22.34\%}$ with 1 random labeled sample per class on CIFAR-10 and lower this error to $\boldsymbol{8.46\%}$ when the labeled sample in each class is highly representative. Our work is fully reproducible: https://github.com/PaulAlbert31/ReLaB.
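
The second and third steps of the pipeline described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the self-supervised features are replaced by synthetic Gaussian blobs, label propagation is a generic closed-form diffusion over a cosine kNN graph, the "reliable" subset is chosen by a simple per-class confidence ranking instead of the paper's label noise detection algorithm, and the names `propagate_labels`, `select_reliable`, `k`, `alpha` and `per_class` are hypothetical. The linked repository contains the actual method.

```python
import numpy as np


def propagate_labels(features, labeled_idx, labeled_y, num_classes, k=10, alpha=0.9):
    """Diffuse a few known labels over a kNN graph built on the features."""
    n = features.shape[0]
    # Cosine similarity between L2-normalised feature vectors.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)  # a node is never its own neighbour
    # Indices of the k most similar neighbours of each node.
    nn = np.argpartition(-sim, k, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    w = np.zeros((n, n))
    w[rows, nn.ravel()] = np.clip(sim[rows, nn.ravel()], 0.0, None)
    w = np.maximum(w, w.T)  # symmetrise the graph
    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(w.sum(axis=1), 1e-12))
    s = w * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot seed matrix: non-zero only for the labeled samples.
    y0 = np.zeros((n, num_classes))
    y0[labeled_idx, labeled_y] = 1.0
    # Closed-form diffusion: solve (I - alpha * S) F = Y0.
    return np.linalg.solve(np.eye(n) - alpha * s, y0)


def select_reliable(scores, per_class):
    """Stand-in for noise detection: keep the most confident samples per class."""
    probs = scores / np.maximum(scores.sum(axis=1, keepdims=True), 1e-12)
    pseudo = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    keep = []
    for c in range(probs.shape[1]):
        idx = np.flatnonzero(pseudo == c)
        keep.extend(idx[np.argsort(-conf[idx])[:per_class]].tolist())
    return np.array(sorted(keep), dtype=int), pseudo


# Synthetic stand-in for self-supervised features: 3 classes, 60 samples each.
rng = np.random.default_rng(0)
num_classes, n_per_class, dim = 3, 60, 16
centers = 5.0 * rng.normal(size=(num_classes, dim))
true_y = np.repeat(np.arange(num_classes), n_per_class)
features = centers[true_y] + rng.normal(size=(true_y.size, dim))

# One labeled sample per class, as in the lowest-supervision setting.
labeled_idx = np.array([np.flatnonzero(true_y == c)[0] for c in range(num_classes)])
scores = propagate_labels(features, labeled_idx, true_y[labeled_idx], num_classes)
reliable_idx, pseudo = select_reliable(scores, per_class=20)
reliable_acc = float((pseudo[reliable_idx] == true_y[reliable_idx]).mean())
print(f"reliable subset size: {reliable_idx.size}, accuracy: {reliable_acc:.2f}")
```

In the full pipeline, the samples in `reliable_idx` together with their pseudo-labels would form the extended labeled set passed to a semi-supervised algorithm, while the remaining samples would be treated as unlabeled.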
