CV · Aug 5, 2018

3D Depthwise Convolution: Reducing Model Parameters in 3D Vision Tasks

arXiv:1808.01556v1 · 55 citations
Originality: Incremental advance
AI Analysis

The paper addresses the bottleneck of model size and efficiency facing researchers and practitioners in 3D vision, though the contribution is incremental: it adapts an existing 2D technique to 3D.

The paper tackles the high memory and computational cost of standard 3D convolutions in deep neural networks for 3D vision tasks. It introduces 3D depthwise convolution, which reduces the parameter count by more than an order of magnitude while maintaining comparable performance in tasks such as classification and reconstruction.

Standard 3D convolution operations require far more memory and computation than 2D convolution operations. This has hindered the development of deep neural networks for many 3D vision tasks. In this paper, we investigate the possibility of applying depthwise separable convolutions in 3D scenarios and introduce the use of 3D depthwise convolution. A 3D depthwise convolution splits a single standard 3D convolution into two separate steps, which reduces the number of parameters in 3D convolutions by more than one order of magnitude. We experiment with 3D depthwise convolution on popular CNN architectures and also compare it with a similar structure called pseudo-3D convolution. The results demonstrate that, with 3D depthwise convolutions, 3D vision tasks such as classification and reconstruction can be carried out with lighter-weight neural networks while still delivering comparable performance.
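The parameter saving claimed in the abstract can be checked with a little arithmetic. The sketch below assumes the two steps are the usual depthwise separable pair carried over from 2D: a per-channel k×k×k depthwise convolution followed by a 1×1×1 pointwise convolution. The abstract does not spell out the layer shapes, so the function names, the channel counts, and the choice to ignore bias terms are all illustrative assumptions rather than the paper's configuration.

```python
def standard_conv3d_params(c_in, c_out, k):
    """Weight count of a standard 3D convolution with a k x k x k kernel (bias ignored)."""
    return k ** 3 * c_in * c_out


def depthwise_separable_conv3d_params(c_in, c_out, k):
    """Weight count of the assumed two-step replacement (bias ignored)."""
    depthwise = k ** 3 * c_in   # step 1: one k x k x k filter per input channel
    pointwise = c_in * c_out    # step 2: 1 x 1 x 1 convolution that mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 64, 128, 3
    standard = standard_conv3d_params(c_in, c_out, k)
    separable = depthwise_separable_conv3d_params(c_in, c_out, k)
    print(f"standard:  {standard} weights")
    print(f"separable: {separable} weights")
    print(f"reduction: {standard / separable:.1f}x")
```

Under these assumptions the ratio of separable to standard weights is 1/c_out + 1/k³, so for a 3×3×3 kernel the reduction approaches 27x as the number of output channels grows. That is consistent with the "more than one order of magnitude" figure, at least for layers with enough output channels.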

