LG · ML · Feb 17, 2021

Sinkhorn Label Allocation: Semi-Supervised Classification via Annealed Self-Training

arXiv:2102.08622v2 · 55 citations · Has Code
Originality: Highly original
AI Analysis

This work addresses semi-supervised learning for computer vision tasks. It offers a novel method whose improvement in label assignment efficiency is incremental.

The paper tackles semi-supervised classification by reinterpreting self-training as an optimal transportation problem and using Sinkhorn iteration for efficient label assignment. It demonstrates the method's effectiveness on the CIFAR-10, CIFAR-100, and SVHN datasets in comparison with FixMatch.

Self-training is a standard approach to semi-supervised learning where the learner's own predictions on unlabeled data are used as supervision during training. In this paper, we reinterpret this label assignment process as an optimal transportation problem between examples and classes, wherein the cost of assigning an example to a class is mediated by the current predictions of the classifier. This formulation facilitates a practical annealing strategy for label assignment and allows for the inclusion of prior knowledge on class proportions via flexible upper bound constraints. The solutions to these assignment problems can be efficiently approximated using Sinkhorn iteration, thus enabling their use in the inner loop of standard stochastic optimization algorithms. We demonstrate the effectiveness of our algorithm on the CIFAR-10, CIFAR-100, and SVHN datasets in comparison with FixMatch, a state-of-the-art self-training algorithm. Our code is available at https://github.com/stanford-futuredata/sinkhorn-label-allocation.
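The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes the cost of assigning an example to a class is the negative log of the predicted probability, and it uses plain entropy-regularized Sinkhorn iteration with equality constraints on class proportions, whereas the paper uses upper bound constraints together with an annealing schedule. The function name `sinkhorn_assign` and the parameters `reg` and `n_iters` are hypothetical.

```python
import numpy as np

def sinkhorn_assign(probs, class_props, reg=1.0, n_iters=200):
    """Soft label assignment via Sinkhorn iteration (illustrative sketch).

    probs:       (n, k) predicted class probabilities for n unlabeled examples
    class_props: (k,) assumed class proportions, summing to 1
    reg:         entropic regularization strength (assumption, not from the paper)
    """
    n, k = probs.shape
    cost = -np.log(probs + 1e-12)           # assumed cost: negative log-probability
    kernel = np.exp(-cost / reg)            # Gibbs kernel of the regularized problem
    row_mass = np.ones(n)                   # each example carries one unit of mass
    col_mass = n * np.asarray(class_props, dtype=float)  # mass each class receives
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = row_mass / (kernel @ v)         # rescale rows to match row_mass
        v = col_mass / (kernel.T @ u)       # rescale columns to match col_mass
    return u[:, None] * kernel * v[None, :]

# Usage: six examples, three classes, uniform class proportions.
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
])
plan = sinkhorn_assign(probs, class_props=[1 / 3, 1 / 3, 1 / 3])
hard_labels = plan.argmax(axis=1)           # one way to read off pseudo-labels
```

Each row of `plan` is a soft label for one example, and the column sums match the requested class proportions. Because the loop consists only of matrix-vector products, it is cheap enough to run inside each training step, which is the property the abstract relies on.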

Code Implementations: 1 repo
