CV | Feb 27, 2019

FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference

arXiv:1902.10421v2 | 448 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of obtaining pixel-level segmentation from coarse image-level annotations for computer vision applications, representing an incremental improvement over existing methods.

The paper tackles weakly and semi-supervised semantic image segmentation by addressing a limitation of classifier-derived localization maps, which focus only on small discriminative parts of objects. It reports improved performance on the Pascal VOC 2012 benchmark.

The main obstacle to weakly supervised semantic image segmentation is the difficulty of obtaining pixel-level information from coarse image-level annotations. Most methods based on image-level annotations use localization maps obtained from the classifier, but these only focus on the small discriminative parts of objects and do not capture precise boundaries. FickleNet explores diverse combinations of locations on feature maps created by generic deep neural networks. It selects hidden units randomly and then uses them to obtain activation scores for image classification. FickleNet implicitly learns the coherence of each location in the feature maps, resulting in a localization map which identifies both discriminative and other parts of objects. The ensemble effects are obtained from a single network by selecting random hidden unit pairs, which means that a variety of localization maps are generated from a single image. Our approach does not require any additional training steps and only adds a simple layer to a standard convolutional neural network; nevertheless it outperforms recent comparable techniques on the Pascal VOC 2012 benchmark in both weakly and semi-supervised settings.
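The stochastic selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `keep_prob` parameter, the CAM-style weighted sum over channels, and the pixel-wise maximum used to combine maps are all assumptions made for illustration, since the summary does not specify how units are selected or how the maps are aggregated. The sketch only shows the general idea that repeated random selections on one feature map yield several different localization maps from a single image.

```python
import random


def stochastic_localization_maps(features, class_weights,
                                 n_samples=8, keep_prob=0.5, seed=0):
    """Sample several localization maps from one feature map.

    features:      C x H x W nested lists (hypothetical backbone output).
    class_weights: C weights of a linear classifier for one class (assumed).
    Returns (maps, scores): n_samples maps of size H x W and the
    classification score obtained with each random selection.
    """
    rng = random.Random(seed)
    channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    maps, scores = [], []
    for _ in range(n_samples):
        # Randomly choose which spatial locations (hidden units) stay active.
        mask = [[1.0 if rng.random() < keep_prob else 0.0
                 for _ in range(width)] for _ in range(height)]
        # CAM-style map (an assumption): weighted channel sum at kept locations.
        cam = [[mask[y][x] * sum(class_weights[c] * features[c][y][x]
                                 for c in range(channels))
                for x in range(width)] for y in range(height)]
        # Global average pooling followed by a linear classifier is
        # equivalent to averaging this map over all locations.
        score = sum(sum(row) for row in cam) / (height * width)
        maps.append(cam)
        scores.append(score)
    return maps, scores


def aggregate_maps(maps):
    """Combine sampled maps by pixel-wise maximum (assumed aggregation rule)."""
    height, width = len(maps[0]), len(maps[0][0])
    return [[max(m[y][x] for m in maps) for x in range(width)]
            for y in range(height)]
```

Because each random selection hides a different set of locations, the classifier cannot rely on the same discriminative region every time, and the combined map can cover more of the object than any single sample.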

