CV · Dec 4, 2018

SurfConv: Bridging 3D and 2D Convolution for RGBD Images

arXiv:1812.01519v1 · 26 citations
Originality: Highly original
AI Analysis

SurfConv addresses the memory inefficiency of 3D convolution in RGBD recognition tasks, offering a more parameter-efficient approach for applications such as indoor and outdoor semantic segmentation.

The paper tackles the problem of efficiently incorporating 3D information into convolutional neural networks for RGBD images. It proposes SurfConv, a depth-aware 2D convolution applied along visible surfaces, which achieves state-of-the-art performance with less than 30% of the parameters of 3D convolution-based methods.

We tackle the problem of using 3D information in convolutional neural networks for downstream recognition tasks. Using depth as an additional channel alongside the RGB input suffers from the scale variance problem present in image-convolution-based approaches. On the other hand, 3D convolution wastes a large amount of memory on mostly unoccupied 3D space, because the data consists only of the surface visible to the sensor. Instead, we propose SurfConv, which "slides" compact 2D filters along the visible 3D surface. SurfConv is formulated as a simple depth-aware multi-scale 2D convolution, through a new Data-Driven Depth Discretization (D4) scheme. We demonstrate the effectiveness of our method on indoor and outdoor 3D semantic segmentation datasets. Our method achieves state-of-the-art performance with less than 30% of the parameters used by the 3D convolution-based approaches.
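As a rough illustration of what a depth-aware multi-scale 2D convolution needs as input, the sketch below splits a depth map into a few depth levels and derives one resize factor per level. It is not the paper's D4 scheme: the abstract does not say how D4 chooses its boundaries, so a plain equal-count quantile split stands in for it here. The function names, the convention that a non-positive depth means "missing", and the pinhole-camera assumption (apparent size shrinks in proportion to depth) are all assumptions made for this example.

```python
from bisect import bisect_right
from statistics import median


def quantile_boundaries(depth_map, num_levels):
    """Pick num_levels - 1 thresholds so each level holds a similar pixel count.

    Stand-in for a data-driven discretization; NOT the paper's D4 scheme.
    """
    valid = sorted(d for row in depth_map for d in row if d > 0)
    if len(valid) < num_levels:
        raise ValueError("not enough valid depth values for the requested levels")
    return [valid[len(valid) * i // num_levels] for i in range(1, num_levels)]


def assign_levels(depth_map, boundaries):
    """Map every pixel to a depth-level index, or -1 where depth is missing."""
    return [
        [bisect_right(boundaries, d) if d > 0 else -1 for d in row]
        for row in depth_map
    ]


def level_scales(depth_map, levels, num_levels):
    """Resize factor per level, relative to the nearest non-empty level.

    Assumes a pinhole camera: an object at depth z appears 1/z as large, so
    enlarging a level by (its median depth / reference depth) roughly
    equalizes apparent object size across levels.
    """
    reps = []
    for k in range(num_levels):
        ds = [
            d
            for d_row, l_row in zip(depth_map, levels)
            for d, l in zip(d_row, l_row)
            if l == k
        ]
        reps.append(median(ds) if ds else None)
    ref = next(r for r in reps if r is not None)
    return [r / ref if r is not None else None for r in reps]


# Toy 2x4 depth map in metres; 0.0 marks a pixel with no depth reading.
depth_map = [
    [1.0, 1.2, 0.0, 4.0],
    [1.1, 2.0, 2.2, 4.5],
]
boundaries = quantile_boundaries(depth_map, num_levels=2)
levels = assign_levels(depth_map, boundaries)
scales = level_scales(depth_map, levels, num_levels=2)
print(boundaries, levels, scales)
```

In a full pipeline, each level's pixels would be masked out, resampled by the corresponding factor, and passed through a shared 2D convolution. That step is omitted here because the abstract does not describe it in enough detail to reproduce.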

Code Implementations: 1 repo
