CV, Nov 10, 2015

Semantic Image Segmentation with Task-Specific Edge Detection Using CNNs and a Discriminatively Trained Domain Transform

arXiv:1511.03328v2, 378 citations
Originality: Incremental advance
AI Analysis

This paper addresses slow inference in semantic segmentation for computer vision applications, offering a faster alternative to dense CRFs with comparable accuracy. The advance is incremental, since it builds on existing CNN and filtering methods.

The paper tackles the computational expense of dense conditional random fields (CRFs) in semantic image segmentation by replacing them with domain transform filtering, which is several times faster, yields comparable results, and still captures object boundaries accurately. It also learns task-specific edge maps from CNN features, in an end-to-end trainable system that optimizes segmentation quality.

Deep convolutional neural networks (CNNs) are the backbone of state-of-the-art semantic image segmentation systems. Recent work has shown that complementing CNNs with fully-connected conditional random fields (CRFs) can significantly enhance their object localization accuracy, yet dense CRF inference is computationally expensive. We propose replacing the fully-connected CRF with domain transform (DT), a modern edge-preserving filtering method in which the amount of smoothing is controlled by a reference edge map. Domain transform filtering is several times faster than dense CRF inference and we show that it yields comparable semantic segmentation results, accurately capturing object boundaries. Importantly, our formulation allows learning the reference edge map from intermediate CNN features instead of using the image gradient magnitude as in standard DT filtering. This produces task-specific edges in an end-to-end trainable system optimizing the target semantic segmentation quality.
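
To make the idea of smoothing "controlled by a reference edge map" concrete, here is a minimal 1D sketch of recursive domain transform filtering in NumPy. It is not the authors' implementation. The function name `dt_filter_1d`, the parameter values, and the hand-made edge map `g` are illustrative assumptions, and the recursion and per-iteration bandwidth follow the commonly cited recursive form of the domain transform, assumed here rather than taken from this page. In the paper's setting the edge map would be predicted from CNN features and the filter applied along image rows and columns to the segmentation scores.

```python
import numpy as np

def dt_filter_1d(x, g, sigma_s=10.0, sigma_r=0.1, n_iters=2):
    """Edge-preserving smoothing of a 1D signal x guided by an edge map g.

    g[i] is the (assumed) edge strength between samples i-1 and i.
    """
    y = np.asarray(x, dtype=float).copy()
    g = np.asarray(g, dtype=float)
    # Distance between neighbors in the transformed domain:
    # a strong edge makes neighbors "far apart", which blocks smoothing.
    d = 1.0 + (sigma_s / sigma_r) * g
    for k in range(1, n_iters + 1):
        # Bandwidth shrinks with each iteration (assumed standard schedule).
        sigma_k = (sigma_s * np.sqrt(3.0) * 2.0 ** (n_iters - k)
                   / np.sqrt(4.0 ** n_iters - 1.0))
        # Feedback weight between neighbors: near 0 across a strong edge.
        w = np.exp(-np.sqrt(2.0) * d / sigma_k)
        # Left-to-right recursive pass.
        for i in range(1, len(y)):
            y[i] = (1.0 - w[i]) * y[i] + w[i] * y[i - 1]
        # Right-to-left pass makes the overall response symmetric.
        for i in range(len(y) - 2, -1, -1):
            y[i] = (1.0 - w[i + 1]) * y[i] + w[i + 1] * y[i + 1]
    return y

# A step signal, filtered with and without an edge marked at the step.
x = np.concatenate([np.zeros(10), np.ones(10)])
edge = np.zeros(20)
edge[10] = 1.0
with_edge = dt_filter_1d(x, edge)             # step stays sharp
without_edge = dt_filter_1d(x, np.zeros(20))  # step is blurred
```

Each update is a convex combination of a sample and its neighbor, so the cost is linear in the number of samples, and every operation is differentiable with respect to both `x` and `g`. That differentiability is the property that lets the edge map be learned rather than fixed.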

