CV · Jun 15, 2019

IMP: Instance Mask Projection for High Accuracy Semantic Segmentation of Things

arXiv:1906.06597v1 · 18 citations
Originality: Incremental advance
AI Analysis

The paper targets semantic segmentation accuracy in complex scenes, such as clothing parsing and street scenes, and offers incremental improvements over existing methods.

The paper introduces Instance Mask Projection (IMP), a trainable operator that projects predicted instance masks into a feature map for semantic segmentation. It reports mIOU gains of 3 points on VCP and on the Thing classes of Cityscapes, and an absolute gain of 20.4% on ModaNet over existing baselines.

In this work, we present a new operator, called Instance Mask Projection (IMP), which projects a predicted Instance Segmentation as a new feature for semantic segmentation. It also supports back propagation, so it is trainable end-to-end. Our experiments show the effectiveness of IMP on both Clothing Parsing (with complex layering, large deformations, and non-convex objects) and Street Scene Segmentation (with many overlapping instances and small objects). On the Varied Clothing Parsing dataset (VCP), we show that instance mask projection improves mIOU by 3 points over a state-of-the-art Panoptic FPN segmentation approach. On the ModaNet clothing parsing dataset, we show a dramatic absolute improvement of 20.4% over existing baseline semantic segmentation results. In addition, the instance mask projection operator works well on other (non-clothing) datasets, providing an improvement of 3 points in mIOU on the Thing classes of Cityscapes, a self-driving dataset, on top of a state-of-the-art approach.
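To make the projection idea concrete, here is a minimal forward-only sketch in plain Python. It is not the authors' implementation: the abstract does not say how masks are weighted or how overlaps are resolved, so weighting each mask by its detection score and combining overlapping instances with a per-class maximum are both assumptions, as are the names `Instance` and `project_instance_masks`. The sketch also assumes each mask has already been resized to its box, and it does not show the back-propagation path.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    """One predicted instance (hypothetical container for this sketch)."""
    class_id: int                    # predicted category index
    score: float                     # detection confidence in [0, 1]
    box: Tuple[int, int, int, int]   # (x0, y0, x1, y1), upper bounds exclusive
    mask: List[List[float]]          # mask probabilities, (y1 - y0) rows by (x1 - x0) columns


def project_instance_masks(instances, num_classes, height, width):
    """Paste instance masks onto a per-class canvas of shape [class][y][x].

    Assumptions (not taken from the abstract): each mask is weighted by its
    detection score, and overlapping instances of the same class are merged
    by taking the per-pixel maximum.
    """
    canvas = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for inst in instances:
        x0, y0, _, _ = inst.box
        channel = canvas[inst.class_id]
        for dy, row in enumerate(inst.mask):
            y = y0 + dy
            if not 0 <= y < height:
                continue  # skip rows that fall outside the image
            for dx, prob in enumerate(row):
                x = x0 + dx
                if 0 <= x < width:
                    channel[y][x] = max(channel[y][x], inst.score * prob)
    return canvas
```

The resulting canvas has one channel per class and the same spatial size as the image, so a semantic segmentation head could consume it as extra input features, for example by concatenating it with its other feature maps.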

