CV, Apr 12, 2017

Deep Contextual Recurrent Residual Networks for Scene Labeling

arXiv:1704.03594v1, 4 citations
Originality: Incremental advance
AI Analysis

This work addresses scene labeling for computer vision applications, presenting an incremental improvement by integrating context modeling into existing architectures.

The paper tackles the problem of scene labeling by addressing the limitation of deep residual networks in capturing long-range contextual dependence, proposing Contextual Recurrent Residual Networks (CRRN) that achieve competitive performance on four challenging datasets.

Deep residual networks are extremely deep architectures that provide rich visual representations and robust convergence behavior, and they have recently achieved exceptional performance on numerous computer vision problems. When applied directly to scene labeling, however, they are limited in their ability to capture long-range contextual dependencies, which are critical to the task. To address this issue, we propose a novel approach, Contextual Recurrent Residual Networks (CRRN), which simultaneously handles rich visual representation learning and long-range context modeling within a fully end-to-end deep network. Furthermore, the proposed CRRN is trained entirely from scratch, without any pre-trained models, in contrast to most existing methods, which are usually fine-tuned from state-of-the-art pre-trained models such as VGG-16 or ResNet. Experiments are conducted on four challenging scene labeling datasets (SiftFlow, CamVid, Stanford Background, and SUN) and compared against various state-of-the-art scene labeling methods.
