LG AI CV ML
May 6, 2019

MixMatch: A Holistic Approach to Semi-Supervised Learning

arXiv:1905.02249v2
3528 citations
AI Analysis

This work addresses machine learning practitioners' reliance on large labeled datasets by improving semi-supervised learning. Its advance is incremental, coming from the integration of existing methods.

The paper unifies existing semi-supervised learning approaches into MixMatch, which guesses low-entropy labels for augmented unlabeled data and mixes labeled and unlabeled data using MixUp. It reports state-of-the-art performance, for example reducing the error rate on CIFAR-10 with 250 labels from 38% to 11%.

Semi-supervised learning has proven to be a powerful paradigm for leveraging unlabeled data to mitigate the reliance on large labeled datasets. In this work, we unify the current dominant approaches for semi-supervised learning to produce a new algorithm, MixMatch, that works by guessing low-entropy labels for data-augmented unlabeled examples and mixing labeled and unlabeled data using MixUp. We show that MixMatch obtains state-of-the-art results by a large margin across many datasets and labeled data amounts. For example, on CIFAR-10 with 250 labels, we reduce error rate by a factor of 4 (from 38% to 11%) and by a factor of 2 on STL-10. We also demonstrate how MixMatch can help achieve a dramatically better accuracy-privacy trade-off for differential privacy. Finally, we perform an ablation study to tease apart which components of MixMatch are most important for its success.
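The two ingredients named in the abstract, guessing low-entropy labels and mixing examples with MixUp, can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation. The abstract does not spell out the mechanics, so the sketch assumes that labels are guessed by averaging predictions over several augmentations and sharpening the average with a temperature, and that the MixUp weight is drawn from a Beta distribution and clamped to at least 0.5. The function names and the default values (`temperature=0.5`, `alpha=0.75`) are illustrative choices.

```python
import random


def sharpen(probs, temperature=0.5):
    """Lower the entropy of a class distribution by raising it to 1/temperature."""
    powered = [p ** (1.0 / temperature) for p in probs]
    total = sum(powered)
    return [p / total for p in powered]


def guess_label(predictions, temperature=0.5):
    """Average K predicted distributions for one unlabeled example, then sharpen.

    `predictions` is a list of K class distributions, one per augmentation
    (assumed to come from some model; no model is defined here).
    """
    k = len(predictions)
    num_classes = len(predictions[0])
    mean = [sum(pred[c] for pred in predictions) / k for c in range(num_classes)]
    return sharpen(mean, temperature)


def mixup(x1, y1, x2, y2, alpha=0.75, rng=random):
    """Convexly combine two (input, label) pairs with a Beta-distributed weight."""
    lam = rng.betavariate(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # assumption: keep the result closer to the first pair
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y, lam


# Two augmentations of one unlabeled example, two classes.
guessed = guess_label([[0.6, 0.4], [0.8, 0.2]])

# Mix a labeled example with an unlabeled one that carries a guessed label.
mixed_x, mixed_y, lam = mixup([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], guessed)
```

Sharpening pushes the averaged prediction toward its most likely class, which is one way to obtain the low-entropy guesses the abstract refers to. The mixed labels remain valid distributions because they are convex combinations of distributions.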

Code Implementations: 30 repos

Data from Papers with Code (CC-BY-SA-4.0)
