CV · Dec 14, 2014

Combining the Best of Graphical Models and ConvNets for Semantic Segmentation

Michael Cogswell, Xiao Lin, Senthil Purushwalkam, Dhruv Batra

arXiv:1412.4313v2 · 9 citations

Originality: Incremental advance

AI Analysis

This work addresses semantic segmentation in computer vision. It introduces a new CNN design, but the overall advance is incremental because it mainly combines existing techniques.

The paper tackles semantic segmentation by combining graphical models for generating diverse segmentation proposals with a novel CNN, SegNet, trained to optimize the PASCAL IOU loss, achieving 52.5% on the PASCAL 2012 challenge.

We present a two-module approach to semantic segmentation that incorporates Convolutional Networks (CNNs) and Graphical Models. Graphical models are used to generate a small (5-30) set of diverse segmentation proposals, such that this set has high recall. Since the number of required proposals is so low, we can extract fairly complex features to rank them. Our complex feature of choice is a novel CNN called SegNet, which directly outputs a (coarse) semantic segmentation. Importantly, SegNet is specifically trained to optimize the corpus-level PASCAL IOU loss function. To the best of our knowledge, this is the first CNN specifically designed for semantic segmentation. This two-module approach achieves $52.5\%$ on the PASCAL 2012 segmentation challenge.
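The abstract names the corpus-level PASCAL IOU as the training objective without defining it. As a rough sketch, assuming the standard PASCAL VOC evaluation convention (per-class intersection and union are accumulated over every image in the dataset before dividing, and pixels labelled 255 are ignored), the metric could be computed as below. The function name `corpus_iou`, the flat-list input format, and the toy data are illustrative assumptions and are not taken from the paper.

```python
def corpus_iou(pairs, num_classes, ignore_label=255):
    """Mean over classes of IoU accumulated across the whole corpus.

    pairs: iterable of (predicted, ground_truth) label maps, each given as a
    flat list of integer class labels of equal length (an assumed format).
    """
    inter = [0] * num_classes  # pixels where prediction and truth are both class c
    union = [0] * num_classes  # pixels where prediction or truth is class c
    for pred, truth in pairs:
        for p, t in zip(pred, truth):
            if t == ignore_label:
                continue  # "void" pixels do not count for or against any class
            if p == t:
                inter[t] += 1
                union[t] += 1
            else:
                union[p] += 1  # false positive for the predicted class
                union[t] += 1  # false negative for the true class
    # Average only over classes that appear in the predictions or ground truth.
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    return sum(ious) / len(ious)


# Toy corpus of two tiny "images" with two classes (hypothetical data).
pairs = [
    ([0, 0, 1, 1], [0, 1, 1, 1]),
    ([0, 0, 0, 0], [0, 0, 0, 255]),
]
score = corpus_iou(pairs, num_classes=2)
```

Because the counts are pooled before the division, the score is not a sum of independent per-image terms, which is what distinguishes a corpus-level objective from averaging per-image IoU.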
