CVAug 27, 2016

Multi-Path Feedback Recurrent Neural Network for Scene Parsing

arXiv:1608.07706v34 citations
Originality Incremental advance
AI Analysis

This work addresses scene parsing for computer vision applications, presenting an incremental advancement with novel components for improved performance.

The paper tackles the scene parsing problem by proposing a Multi-Path Feedback recurrent neural network (MPF-RNN) to enhance long-range context modeling and distinguish confusing pixels, achieving significant improvements over strong baselines like VGG16 and Res101 on five challenging benchmarks including ADE20K.

In this paper, we consider the scene parsing problem and propose a novel Multi-Path Feedback recurrent neural network (MPF-RNN) for parsing scene images. MPF-RNN can enhance the capability of RNNs in modeling long-range context information at multiple levels and better distinguish pixels that are easy to confuse. Different from feedforward CNNs and RNNs with only single feedback, MPF-RNN propagates the contextual features learned at top layer through \textit{multiple} weighted recurrent connections to learn bottom features. For better training MPF-RNN, we propose a new strategy that considers accumulative loss at multiple recurrent steps to improve performance of the MPF-RNN on parsing small objects. With these two novel components, MPF-RNN has achieved significant improvement over strong baselines (VGG16 and Res101) on five challenging scene parsing benchmarks, including traditional SiftFlow, Barcelona, CamVid, Stanford Background as well as the recently released large-scale ADE20K.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes