CVApr 22, 2021

Self-Supervised Learning from Semantically Imprecise Data

arXiv:2104.10901v2
AI Analysis

This work addresses the problem of training classifiers with scarce expert labels by leveraging abundant but imprecise data, offering an incremental improvement over existing methods.

The paper tackles the problem of learning from semantically imprecise labels (e.g., 'animal') to make precise predictions (e.g., 'snow bunting'), extending the CHILLAX method with a self-supervised scheme using constrained semantic extrapolation to generate pseudo-labels. This approach achieves a consistent accuracy improvement of 0.84 to 1.19 percentage points over CHILLAX.

Learning from imprecise labels such as "animal" or "bird", but making precise predictions like "snow bunting" at inference time is an important capability for any classifier when expertly labeled training data is scarce. Contributions by volunteers or results of web crawling lack precision in this manner, but are still valuable. And crucially, these weakly labeled examples are available in larger quantities for lower cost than high-quality bespoke training data. CHILLAX, a recently proposed method to tackle this task, leverages a hierarchical classifier to learn from imprecise labels. However, it has two major limitations. First, it does not learn from examples labeled as the root of the hierarchy, e.g., "object". Second, an extrapolation of annotations to precise labels is only performed at test time, where confident extrapolations could be already used as training data. In this work, we extend CHILLAX with a self-supervised scheme using constrained semantic extrapolation to generate pseudo-labels. This addresses the second concern, which in turn solves the first problem, enabling an even weaker supervision requirement than CHILLAX. We evaluate our approach empirically, showing that our method allows for a consistent accuracy improvement of 0.84 to 1.19 percent points over CHILLAX and is suitable as a drop-in replacement without any negative consequences such as longer training times.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes