CV · Jul 26, 2016

Region-based semantic segmentation with end-to-end training

arXiv:1607.07671v1 · 86 citations
Originality: Incremental advance
AI Analysis

The paper addresses pixel-level labeling accuracy in computer vision, particularly at object boundaries. The advance is incremental: it combines two existing paradigms, region-based classification and fully convolutional networks, rather than introducing a new one.

The paper tackles semantic segmentation by enabling end-to-end training for region-based methods, achieving state-of-the-art results with 64.0% class-average accuracy on SIFT Flow and 49.9% on PASCAL Context.
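The class-average accuracy quoted above is commonly computed as the mean of per-class pixel accuracies, so that rare classes count as much as frequent ones. The following is a minimal sketch under that common definition, not the paper's evaluation code; the function name, the `ignore_label` convention, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def class_average_accuracy(pred, gt, num_classes, ignore_label=-1):
    """Mean over classes of the fraction of that class's pixels labeled correctly.

    Hypothetical helper: ignore_label marks unlabeled pixels (an assumption).
    Classes that never occur in the ground truth are skipped.
    """
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    valid = gt != ignore_label
    per_class = []
    for c in range(num_classes):
        in_class = valid & (gt == c)
        if in_class.sum() == 0:
            continue  # class absent from the ground truth
        per_class.append((pred[in_class] == c).mean())
    return float(np.mean(per_class))

# Toy example: class 0 covers 6 pixels (all correct), class 1 covers 2 (one correct).
gt = np.array([[0, 0, 0, 0],
               [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 0],
                 [0, 0, 1, 0]])

pixel_acc = float((pred == gt).mean())             # 7 of 8 pixels correct
class_avg = class_average_accuracy(pred, gt, 2)    # mean of 1.0 and 0.5
```

In this toy case the overall pixel accuracy is 0.875 while the class-average accuracy is 0.75, which shows how the metric penalizes errors on small classes more heavily.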

We propose a novel method for semantic segmentation, the task of labeling each pixel in an image with a semantic class. Our method combines the advantages of the two main competing paradigms. Methods based on region classification offer proper spatial support for appearance measurements, but typically operate in two separate stages, neither of which targets pixel labeling performance at the end of the pipeline. More recent fully convolutional methods are capable of end-to-end training for the final pixel labeling, but resort to fixed patches as spatial support. We show how to modify modern region-based approaches to enable end-to-end training for semantic segmentation. This is achieved via a differentiable region-to-pixel layer and a differentiable free-form Region-of-Interest pooling layer. Our method improves the state of the art in terms of class-average accuracy with 64.0% on SIFT Flow and 49.9% on PASCAL Context, and is particularly accurate at object boundaries.
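To make the two layers named in the abstract more concrete, here is a rough NumPy sketch of the forward pass only. The abstract does not specify the operations, so the choices here are assumptions: free-form pooling is modeled as a max over the pixels inside a region mask (ignoring any spatial grid), and the region-to-pixel step gives each pixel the maximum score among the regions that cover it. The function names and toy data are hypothetical.

```python
import numpy as np

def freeform_roi_max_pool(features, mask):
    """Max-pool a (C, H, W) feature map over the pixels where mask is True.

    Assumed simplification: one vector per region, pooled over the free-form
    mask instead of the full bounding box.
    """
    masked = np.where(mask[None, :, :], features, -np.inf)
    return masked.reshape(features.shape[0], -1).max(axis=1)

def region_to_pixel(region_scores, masks):
    """Map region class scores (R, K) to pixel class scores (K, H, W).

    Assumed aggregation: per pixel and class, take the max over all regions
    whose mask (R, H, W) covers the pixel. Uncovered pixels stay at -inf.
    """
    expanded = np.where(masks[:, None, :, :],
                        region_scores[:, :, None, None],
                        -np.inf)            # shape (R, K, H, W)
    return expanded.max(axis=0)

# Toy example: two overlapping regions on a 2x3 image, two classes.
masks = np.zeros((2, 2, 3), dtype=bool)
masks[0, :, :2] = True    # region 0 covers the left two columns
masks[1, :, 1:] = True    # region 1 covers the right two columns
region_scores = np.array([[2.0, 0.5],
                          [1.0, 3.0]])

pixel_scores = region_to_pixel(region_scores, masks)
labels = pixel_scores.argmax(axis=0)        # per-pixel class decision

features = np.array([[[1.0, 5.0, 9.0],
                      [2.0, 4.0, 0.0]]])    # (C=1, H=2, W=3)
pooled = freeform_roi_max_pool(features, masks[0])
```

Because both steps are built from max operations, an automatic differentiation framework could route the pixel-level loss gradient back to the winning region and feature location, which is the property that would allow end-to-end training.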

Code Implementations: 1 repo
