CV | Feb 26, 2025

Knowledge Distillation for Semantic Segmentation: A Label Space Unification Approach

arXiv:2502.19177v1 | h-index: 2 | IROS
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of training better models for autonomous driving by integrating multiple datasets. The advance is incremental, since it builds on existing knowledge distillation methods.

The paper tackles the problem of inconsistent taxonomies and labeling policies across semantic segmentation datasets by proposing a knowledge distillation approach that unifies label spaces, resulting in student models outperforming teachers in urban and off-road driving domains with datasets of 388,230 and 18,558 images.

An increasing number of datasets sharing similar domains for semantic segmentation have been published over the past few years. Despite the growing amount of overall data, it is still difficult to train bigger and better models because taxonomies and labeling policies are inconsistent across datasets. To this end, we propose a knowledge distillation approach that also serves as a label space unification method for semantic segmentation. In short, a teacher model is trained on a source dataset with a given taxonomy, then used to pseudo-label additional data for which ground truth labels of a related label space exist. By mapping the related taxonomies to the source taxonomy, we create constraints within which the model can predict pseudo-labels. Using the improved pseudo-labels, we train student models that consistently outperform their teachers in two challenging domains, namely urban and off-road driving. Our ground truth-corrected pseudo-labels span 12 and 7 public datasets with 388,230 and 18,558 images for the urban and off-road domains, respectively, creating the largest compound datasets for autonomous driving to date.
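
The constraint idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the mapping table, and both function names are hypothetical, and the teacher is assumed to output one score per source class for every pixel. Each ground truth label from the related dataset restricts the pseudo-label to the source classes it maps to, and the teacher chooses only among those.

```python
# Hypothetical source taxonomy (illustration only, not taken from the paper).
SOURCE_CLASSES = ["road", "sidewalk", "car", "truck", "vegetation"]

# Hypothetical mapping: each class of a related dataset lists the source
# classes it may correspond to.
RELATED_TO_SOURCE = {
    "drivable": {"road"},
    "ground": {"road", "sidewalk"},
    "vehicle": {"car", "truck"},
    "nature": {"vegetation"},
}


def constrained_pseudo_label(teacher_scores, gt_label):
    """Return the teacher's best source class among those the ground truth allows."""
    allowed = RELATED_TO_SOURCE[gt_label]
    candidates = [
        (score, name)
        for name, score in zip(SOURCE_CLASSES, teacher_scores)
        if name in allowed
    ]
    # Highest score wins; ties are broken by class name.
    return max(candidates)[1]


def pseudo_label_image(score_map, gt_map):
    """Apply the constraint to every pixel of a 2D image given as nested lists."""
    return [
        [constrained_pseudo_label(scores, gt) for scores, gt in zip(score_row, gt_row)]
        for score_row, gt_row in zip(score_map, gt_map)
    ]


# One row of two pixels. The teacher prefers "car" for both, but the first
# pixel is labeled "ground" in the related dataset, so "car" is ruled out.
scores = [[[0.10, 0.20, 0.50, 0.15, 0.05], [0.10, 0.20, 0.50, 0.15, 0.05]]]
ground_truth = [["ground", "vehicle"]]
labels = pseudo_label_image(scores, ground_truth)
```

In this sketch the first pixel falls back to "sidewalk", the highest-scoring class that the "ground" label permits, while the second keeps the teacher's unconstrained choice "car". The student would then be trained on labels produced this way.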

