CV · Nov 3, 2021

HS3: Learning with Proper Task Complexity in Hierarchically Supervised Semantic Segmentation

arXiv:2111.02333v1 · 20 citations
Originality: Incremental advance
AI Analysis

This work improves semantic segmentation accuracy by matching the complexity of each intermediate layer's supervision to that layer's representation power. The advance is incremental but offers practical gains for computer vision applications.

The paper tackles the problem of varying representation powers in transitional layers of deeply supervised segmentation networks by proposing HS3, a training scheme that supervises intermediate layers with varying task complexity, and HS3-Fuse, a fusion framework to aggregate hierarchical features. The results show that HS3 outperforms vanilla deep supervision with no added inference cost, and HS3-Fuse achieves state-of-the-art results on NYUD-v2 and Cityscapes benchmarks.

While deeply supervised networks are common in recent literature, they typically impose the same learning objective on all transitional layers despite their varying representation powers. In this paper, we propose Hierarchically Supervised Semantic Segmentation (HS3), a training scheme that supervises intermediate layers in a segmentation network to learn meaningful representations by varying task complexity. To enforce a consistent performance vs. complexity trade-off throughout the network, we derive various sets of class clusters to supervise each transitional layer of the network. Furthermore, we devise a fusion framework, HS3-Fuse, to aggregate the hierarchical features generated by these layers, which can provide rich semantic contexts and further enhance the final segmentation. Extensive experiments show that our proposed HS3 scheme considerably outperforms vanilla deep supervision with no added inference cost. Our proposed HS3-Fuse framework further improves segmentation predictions and achieves state-of-the-art results on two large segmentation benchmarks: NYUD-v2 and Cityscapes.
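The core idea of supervising each transitional layer with its own set of class clusters can be illustrated with a minimal NumPy sketch. Everything named here is an assumption for illustration: the cluster maps in `CLUSTER_MAPS`, the six-class label space, the loss weights, and the helper names `remap_labels`, `cross_entropy`, and `hierarchical_loss` are hypothetical, and the abstract does not specify how the paper derives its clusters or weights its losses.

```python
import numpy as np

# Hypothetical fine-to-coarse cluster maps (NOT the paper's clusters).
# Index = fine class id, value = cluster id used to supervise that layer.
CLUSTER_MAPS = [
    np.array([0, 0, 1, 1, 1, 2]),  # shallow layer: 3 coarse clusters
    np.array([0, 1, 2, 2, 3, 4]),  # middle layer: 5 clusters
    np.arange(6),                  # final layer: all 6 fine classes
]


def remap_labels(fine_labels, cluster_map):
    """Convert a fine-grained label map into cluster ids via a lookup table."""
    return cluster_map[fine_labels]


def cross_entropy(logits, labels):
    """Mean pixel-wise cross-entropy; logits are (C, H, W), labels are (H, W)."""
    shifted = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=0, keepdims=True))
    rows, cols = np.indices(labels.shape)
    return -log_probs[labels, rows, cols].mean()


def hierarchical_loss(layer_logits, fine_labels, weights=(0.4, 0.4, 1.0)):
    """Sum per-layer losses, each computed against that layer's own clusters.

    The weights are illustrative assumptions, not values from the paper.
    """
    total = 0.0
    for logits, cluster_map, weight in zip(layer_logits, CLUSTER_MAPS, weights):
        targets = remap_labels(fine_labels, cluster_map)
        total += weight * cross_entropy(logits, targets)
    return total


# Toy usage: a 2x2 label map and all-zero logits from three heads.
fine_labels = np.array([[0, 2], [5, 3]])
layer_logits = [np.zeros((3, 2, 2)), np.zeros((5, 2, 2)), np.zeros((6, 2, 2))]
loss = hierarchical_loss(layer_logits, fine_labels)
```

In a sketch like this the auxiliary heads are needed only to compute the training loss, which is consistent with the abstract's claim that HS3 adds no inference cost.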

