RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion
This work addresses performance bottlenecks in 3D semantic scene completion for applications like robotics and autonomous driving by reducing computational costs while improving accuracy.
The paper tackles the problem of 3D semantic scene completion by proposing a lightweight network that uses both RGB and depth images to improve accuracy, achieving gains of 5.9% in shape completion and 5.7% in semantic scene completion while using only 21% of the parameters and 16.6% of the FLOPs of the state-of-the-art method.
RGB images differ from depth images in that they carry more detail about color and texture, which makes them a vital complement to depth for boosting the performance of 3D semantic scene completion (SSC). SSC comprises 3D shape completion (SC) and semantic scene labeling, yet most existing methods use depth as the sole input, which creates a performance bottleneck. Moreover, state-of-the-art methods employ 3D CNNs, which are cumbersome and have a very large number of parameters. We introduce a lightweight Dimensional Decomposition Residual (DDR) network for 3D dense prediction tasks. Its novel factorized convolution layer effectively reduces the number of network parameters, and the proposed multi-scale fusion mechanism for depth and color images improves completion and segmentation accuracy simultaneously. Our method demonstrates excellent performance on two public datasets. Compared with the latest method, SSCNet, we achieve gains of 5.9% in SC-IoU and 5.7% in SSC-IoU while using only 21% of the network parameters and 16.6% of the FLOPs.
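To illustrate why a factorized convolution reduces parameters, the following minimal sketch counts the weights of a dense 3D convolution and of a factorized alternative. The abstract does not specify the exact layer design, so the sketch assumes that each k×k×k convolution is replaced by three sequential one-dimensional convolutions (1×1×k, 1×k×1, k×1×1), one per spatial axis. The function names are hypothetical and chosen only for illustration. The result is a per-layer ratio, not the 21% network-level figure reported above, which depends on the full architecture.

```python
def conv3d_params(c_in, c_out, kernel, bias=True):
    """Number of learnable parameters in one 3D convolution layer."""
    kd, kh, kw = kernel
    weights = c_in * c_out * kd * kh * kw
    return weights + (c_out if bias else 0)


def full_conv_params(c_in, c_out, k=3, bias=True):
    """Parameters of a dense k x k x k 3D convolution."""
    return conv3d_params(c_in, c_out, (k, k, k), bias)


def factorized_conv_params(c_in, c_out, k=3, bias=True):
    """Parameters of three sequential 1D convolutions, one per axis.

    Assumption (not stated in the abstract): the factorization is
    1x1xk -> 1xkx1 -> kx1x1, with c_out channels after the first layer.
    """
    first = conv3d_params(c_in, c_out, (1, 1, k), bias)
    second = conv3d_params(c_out, c_out, (1, k, 1), bias)
    third = conv3d_params(c_out, c_out, (k, 1, 1), bias)
    return first + second + third


if __name__ == "__main__":
    channels = 64  # illustrative channel width, not taken from the paper
    full = full_conv_params(channels, channels)
    fact = factorized_conv_params(channels, channels)
    print(f"dense 3x3x3:  {full} parameters")
    print(f"factorized:   {fact} parameters")
    print(f"ratio:        {fact / full:.3f}")
```

Under this assumption, and ignoring biases, the weight count for equal input and output channels drops from k³·C² to 3k·C², which is one third of the dense layer for k = 3.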