Indoor Semantic Segmentation using depth information
This work addresses indoor scene understanding for applications such as robotics and augmented reality. The contribution is incremental, however, as it builds on existing deep learning approaches with depth data.
The paper tackles multi-class segmentation of indoor scenes using RGB-D inputs by applying a multiscale convolutional network to learn features directly from images and depth, achieving state-of-the-art accuracy of 64.5% on the NYU-v2 depth dataset.
This work addresses multi-class segmentation of indoor scenes with RGB-D inputs. While this area of research has gained much attention recently, most works still rely on hand-crafted features. In contrast, we apply a multiscale convolutional network to learn features directly from the images and the depth information. We obtain state-of-the-art results on the NYU-v2 depth dataset, with an accuracy of 64.5%. We illustrate the labeling of indoor scenes in video sequences that could be processed in real time using appropriate hardware such as an FPGA.
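As a rough illustration of what a multiscale RGB-D input could look like, the sketch below stacks an RGB image and a depth map into a four-channel array and builds a small resolution pyramid from it. This is not the authors' pipeline: the function names (`stack_rgbd`, `downsample2x`, `rgbd_pyramid`), the choice of three scales, and the use of 2x2 average pooling are all assumptions made for the example, and the abstract does not specify how the scales are constructed or normalized.

```python
import numpy as np


def stack_rgbd(rgb, depth):
    """Concatenate an HxWx3 RGB image and an HxW depth map into HxWx4."""
    if rgb.shape[:2] != depth.shape:
        raise ValueError("rgb and depth must share height and width")
    return np.concatenate(
        [rgb.astype(np.float32), depth[..., None].astype(np.float32)],
        axis=-1,
    )


def downsample2x(x):
    """Halve resolution by averaging non-overlapping 2x2 blocks.

    Odd trailing rows or columns are cropped. Average pooling is an
    assumption of this sketch, not a detail taken from the abstract.
    """
    h = (x.shape[0] // 2) * 2
    w = (x.shape[1] // 2) * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))


def rgbd_pyramid(rgb, depth, num_scales=3):
    """Return a list of RGB-D arrays, from full resolution to coarsest.

    num_scales=3 is a hypothetical default chosen for illustration.
    """
    levels = [stack_rgbd(rgb, depth)]
    for _ in range(num_scales - 1):
        levels.append(downsample2x(levels[-1]))
    return levels


if __name__ == "__main__":
    rgb = np.zeros((8, 12, 3), dtype=np.uint8)
    depth = np.ones((8, 12), dtype=np.float32)
    for level in rgbd_pyramid(rgb, depth):
        print(level.shape)
```

In a setup like this, each pyramid level would be fed to a convolutional feature extractor, and the per-scale feature maps would be upsampled and combined before per-pixel classification.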