CV, LG · Nov 23, 2021

Semi-Supervised Learning with Taxonomic Labels

arXiv:2111.11595v1 · 11 citations
Originality: Incremental advance
AI Analysis

This work addresses the scarcity of labeled data in fine-grained image classification, for example in biological domains. Its contribution is incremental: it builds on existing semi-supervised techniques.

The paper trains image classifiers in fine-grained domains by incorporating coarse taxonomic labels. On Semi-iNat, Phylum labels improve Species-level classification accuracy by 6% in a transfer learning setting, and combining the label hierarchy with FixMatch adds a further 1.3%.

We propose techniques to incorporate coarse taxonomic labels to train image classifiers in fine-grained domains. Such labels can often be obtained with less effort in fine-grained domains such as the natural world, where categories are organized according to a biological taxonomy. On the Semi-iNat dataset, consisting of 810 species across three Kingdoms, incorporating Phylum labels improves the Species-level classification accuracy by 6% in a transfer learning setting using ImageNet pre-trained models. Incorporating the hierarchical label structure with a state-of-the-art semi-supervised learning algorithm called FixMatch improves the performance further by 1.3%. The relative gains are larger when detailed labels such as Class or Order are provided, or when models are trained from scratch. However, we find that most methods are not robust to the presence of out-of-domain data from novel classes. We propose a technique to select relevant data from a large collection of unlabeled images, guided by the hierarchy, which improves the robustness. Overall, our experiments show that semi-supervised learning with coarse taxonomic labels is practical for training classifiers in fine-grained domains.
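The abstract does not spell out how a coarse label supervises a species-level classifier. One common way to do this, shown here as a minimal NumPy sketch and not necessarily the authors' formulation, is to sum the predicted species probabilities within each coarse group (for example each Phylum) and apply cross-entropy to the resulting coarse distribution. The function names, the `species_to_coarse` mapping, and the toy sizes are all hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def coarse_probabilities(species_probs, species_to_coarse, num_coarse):
    """Marginalize species probabilities into coarse-group probabilities.

    species_to_coarse[s] is the (hypothetical) coarse group index of species s.
    """
    num_species = species_probs.shape[1]
    # membership[s, c] = 1 if species s belongs to coarse group c.
    membership = np.zeros((num_species, num_coarse))
    membership[np.arange(num_species), species_to_coarse] = 1.0
    return species_probs @ membership

def coarse_label_loss(logits, coarse_targets, species_to_coarse, num_coarse):
    """Cross-entropy on coarse labels, computed from species-level logits."""
    probs = coarse_probabilities(softmax(logits), species_to_coarse, num_coarse)
    picked = probs[np.arange(len(coarse_targets)), coarse_targets]
    return float(-np.log(picked + 1e-12).mean())

# Toy setup (assumed, not from the paper): 4 species in 2 coarse groups.
species_to_coarse = np.array([0, 0, 1, 1])
logits = np.zeros((1, 4))          # uniform prediction over species
coarse_targets = np.array([0])     # the image is only known to be in group 0
loss = coarse_label_loss(logits, coarse_targets, species_to_coarse, num_coarse=2)
```

In a setup like this, images with only a coarse label would contribute this loss, while images with a species label would use ordinary species-level cross-entropy; how the two terms are weighted is a design choice the abstract does not specify.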

Code Implementations: 1 repo