CV · Apr 12, 2017

Deep Contextual Recurrent Residual Networks for Scene Labeling

T. Hoang Ngan Le, Chi Nhan Duong, Ligong Han, Khoa Luu, Marios Savvides, Dipan Pal

arXiv:1704.03594v1 · 3.84 citations

Originality: Incremental advance

AI Analysis

This work addresses scene labeling for computer vision applications, presenting an incremental improvement by integrating context modeling into existing architectures.

The paper tackles scene labeling by targeting a limitation of deep residual networks: their difficulty in capturing long-range contextual dependence. It proposes Contextual Recurrent Residual Networks (CRRN), which achieve competitive performance on four challenging datasets.

Deep residual networks, designed as extremely deep architectures, provide rich visual representations and robust convergence behavior, and have recently achieved exceptional performance on numerous computer vision problems. When applied directly to scene labeling, however, they are limited in their ability to capture long-range contextual dependence, which is critical for this task. To address this issue, we propose Contextual Recurrent Residual Networks (CRRN), a novel approach that handles rich visual representation learning and long-range context modeling simultaneously within a fully end-to-end deep network. Furthermore, the proposed CRRN is trained entirely from scratch, without any pre-trained models, in contrast to most existing methods, which are usually fine-tuned from state-of-the-art pre-trained models such as VGG-16 and ResNet. Experiments are conducted on four challenging scene labeling datasets (SiftFlow, CamVid, Stanford Background, and SUN), and the results are compared against various state-of-the-art scene labeling methods.
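
The interplay the abstract describes, local residual features combined with recurrent context modeling, can be illustrated with a toy one-dimensional sketch. This is not the authors' CRRN: the abstract does not specify the architecture, so the 3-tap filter, the scalar weights, the bidirectional sweep, and every function name below are hypothetical choices made only to show why a recurrent pass can carry information farther than a single local residual unit.

```python
import math

def residual_unit(row, kernel):
    """Toy residual unit: y = x + relu(F(x)), with F a zero-padded 3-tap filter."""
    n = len(row)
    padded = [0.0] + list(row) + [0.0]
    out = []
    for i in range(n):
        f = sum(k * padded[i + j] for j, k in enumerate(kernel))
        out.append(row[i] + max(0.0, f))
    return out

def recurrent_sweep(row, w_in, w_rec):
    """Left-to-right recurrence: h[t] = tanh(w_in * x[t] + w_rec * h[t-1])."""
    h, out = 0.0, []
    for x in row:
        h = math.tanh(w_in * x + w_rec * h)
        out.append(h)
    return out

def contextual_features(row, kernel=(0.25, 0.5, 0.25), w_in=1.0, w_rec=0.9):
    """Hypothetical combination: local residual features plus forward and
    backward recurrent sweeps over those features."""
    local = residual_unit(row, kernel)
    forward = recurrent_sweep(local, w_in, w_rec)
    backward = recurrent_sweep(local[::-1], w_in, w_rec)[::-1]
    return [l + f + b for l, f, b in zip(local, forward, backward)]

# A single impulse at position 0 of an 8-element row.
row = [1.0] + [0.0] * 7
local = residual_unit(row, (0.25, 0.5, 0.25))
context = contextual_features(row)
print("local feature at far end:     ", local[7])
print("contextual feature at far end:", context[7])
```

In this toy setting the residual unit alone leaves the far end of the row untouched, because its filter only sees immediate neighbors, while the forward recurrent sweep carries a decaying trace of the impulse across the whole row. Stacking many residual units would also widen the receptive field, so the sketch only illustrates the intuition, not the paper's quantitative claims.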
