LG ML | Dec 2, 2019

Combining MixMatch and Active Learning for Better Accuracy with Fewer Labels

Shuang Song, David Berthelot, Afshin Rostamizadeh

arXiv:1912.00594v211.836 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the cost of labeling data, a practical concern for machine learning practitioners. The contribution is incremental: it builds on the existing MixMatch algorithm and established active learning methods.

The paper aims to improve semi-supervised learning accuracy with fewer labeled examples by combining MixMatch with active learning, reporting gains of up to 1.5% in absolute accuracy on the CIFAR-10, CIFAR-100, and SVHN benchmarks.

We propose using active-learning-based techniques to further improve the state-of-the-art semi-supervised learning MixMatch algorithm. We provide a thorough empirical evaluation of several active-learning and baseline methods, which demonstrates a significant improvement on the benchmark CIFAR-10, CIFAR-100, and SVHN datasets (as much as 1.5% in absolute accuracy). We also provide an empirical analysis of the cost trade-off between incrementally gathering more labeled versus unlabeled data. This analysis can be used to measure the relative value of labeled/unlabeled data at different points of the learning curve, where we find that although the incremental value of labeled data can be as much as 20x that of unlabeled data, it quickly diminishes to less than 3x once more than 2,000 labeled examples are observed. Code can be found at https://github.com/google-research/mma.
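
To make the active-learning step concrete, the sketch below shows one common uncertainty criterion, margin sampling, which asks for labels on the unlabeled examples where the model's top two class probabilities are closest. The abstract does not say which selection criteria the paper evaluates, so this is an illustrative assumption rather than the authors' method. The names `margin`, `select_for_labeling`, and `pool_probs` are hypothetical and do not come from the linked repository.

```python
from typing import List, Sequence


def margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small gap = uncertain)."""
    top_two = sorted(probs, reverse=True)[:2]
    return top_two[0] - top_two[1]


def select_for_labeling(pool_probs: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Return indices of the `budget` unlabeled examples with the smallest margin."""
    ranked = sorted(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))
    return ranked[:budget]


# Hypothetical model predictions on four unlabeled examples (three classes each).
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction, large margin
    [0.40, 0.35, 0.25],  # uncertain
    [0.50, 0.49, 0.01],  # top two classes nearly tied
    [0.70, 0.20, 0.10],
]

# Indices of the two examples to send to an annotator in this round.
chosen = select_for_labeling(pool_probs, budget=2)
```

In a full loop, the chosen examples would be labeled, moved from the unlabeled pool to the labeled set, and the semi-supervised model retrained before the next selection round.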

