VoxSegNet: Volumetric CNNs for Semantic Part Segmentation of 3D Shapes
This work addresses the challenge of detailed 3D shape analysis for applications such as computer graphics and robotics, but its contribution is incremental, as it builds on existing volumetric CNN methods.
The paper tackles fine-grained part segmentation of 3D shapes from voxel data, where raising the resolution to capture detail quickly exhausts computational resources. It proposes VoxSegNet, which combines a spatial dense extraction (SDE) module and an attention feature aggregation (AFA) module to preserve detail under limited resolution, and it reports effective results on a large-scale dataset.
Voxels are an important format for representing geometric data and have been widely used for 3D deep learning in shape analysis because of their generalization ability and regular data format. However, fine-grained tasks such as part segmentation require detailed structural information, which calls for a higher voxel resolution and in turn causes other issues such as the exhaustion of computational resources. In this paper, we propose a novel volumetric convolutional neural network that can extract discriminative features encoding detailed information from voxelized 3D data under a limited resolution. To this end, a spatial dense extraction (SDE) module is designed to preserve the spatial resolution during feature extraction, alleviating the loss of detail caused by sub-sampling operations such as max-pooling. An attention feature aggregation (AFA) module is also introduced to adaptively select informative features from different abstraction scales, leading to segmentation with both semantic consistency and high accuracy of details. Experimental results on a large-scale dataset demonstrate the effectiveness of our method in 3D shape part segmentation.
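The two ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and every design choice in it is an assumption: the abstract does not say how the SDE module preserves resolution, so a stride-1 dilated convolution with zero padding stands in for it, and a per-voxel softmax over scales stands in for the AFA module. The function names (`max_pool3d`, `dilated_conv3d`, `attention_aggregate`), the 16^3 grid, and the random attention scores are hypothetical.

```python
import numpy as np

def max_pool3d(x, k=2):
    """Non-overlapping k*k*k max-pooling; shrinks every side by a factor k."""
    d, h, w = (s // k for s in x.shape)
    x = x[:d * k, :h * k, :w * k]
    return x.reshape(d, k, h, k, w, k).max(axis=(1, 3, 5))

def dilated_conv3d(x, kernel, dilation=1):
    """Stride-1 dilated 3D cross-correlation with zero padding.

    ASSUMPTION: a stand-in for resolution-preserving feature extraction.
    The kernel is assumed cubic with an odd side; output shape equals x.shape.
    """
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    xp = np.pad(x, pad)
    depth, height, width = x.shape
    out = np.zeros(x.shape, dtype=float)
    for i in range(k):
        for j in range(k):
            for l in range(k):
                oi, oj, ol = i * dilation, j * dilation, l * dilation
                out += kernel[i, j, l] * xp[oi:oi + depth,
                                            oj:oj + height,
                                            ol:ol + width]
    return out

def attention_aggregate(features, scores):
    """Fuse features from S scales with per-voxel softmax weights.

    ASSUMPTION: a stand-in for attention-based aggregation.
    features: (S, C, D, H, W), already at a common resolution.
    scores:   (S, D, H, W), unnormalized attention scores.
    Returns fused features (C, D, H, W) and the weights (S, D, H, W).
    """
    scores = scores - scores.max(axis=0, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=0, keepdims=True)
    fused = (weights[:, None] * features).sum(axis=0)
    return fused, weights

rng = np.random.default_rng(0)
vox = rng.random((16, 16, 16))            # hypothetical 16^3 voxel grid

pooled = max_pool3d(vox)                   # sub-sampling loses resolution
dense = dilated_conv3d(vox, rng.standard_normal((3, 3, 3)), dilation=2)

features = rng.standard_normal((3, 4, 16, 16, 16))  # 3 scales, 4 channels
scores = rng.standard_normal((3, 16, 16, 16))       # placeholder scores
fused, weights = attention_aggregate(features, scores)

print(pooled.shape, dense.shape, fused.shape)
```

In this sketch, max-pooling reduces the 16^3 grid to 8^3, while the dilated convolution enlarges the receptive field and keeps all 16^3 voxels. In a trained network the attention scores would be predicted from the features themselves rather than drawn at random.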