DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation
This work addresses the problem of efficient and accurate point cloud segmentation for applications like autonomous driving, though it appears incremental as it builds on existing dual-representation ideas.
The authors tackle point cloud segmentation by proposing DRINet, a dual-representation iterative learning network that propagates features between point and voxel representations, achieving state-of-the-art results with a real-time inference speed of 62 ms per frame on large-scale outdoor datasets.
We present a novel and flexible architecture for point cloud segmentation with dual-representation iterative learning. In point cloud processing, different representations have their own pros and cons. Finding a suitable way to represent point cloud data while preserving its intrinsic physical properties, such as permutation invariance and scale invariance, is therefore a fundamental problem. We propose DRINet, which serves as a basic network structure for dual-representation learning, with great flexibility in feature transfer and lower computational cost, especially for large-scale point clouds. DRINet mainly consists of two modules, Sparse Point-Voxel Feature Extraction and Sparse Voxel-Point Feature Extraction. By applying these two modules iteratively, features can be propagated between the two representations. We further propose a novel multi-scale pooling layer for pointwise locality learning to improve the propagation of context information. Our network achieves state-of-the-art results for point cloud classification and segmentation tasks on several datasets while maintaining high runtime efficiency. For large-scale outdoor scenarios, our method outperforms state-of-the-art methods with a real-time inference speed of 62 ms per frame.
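The idea of passing features back and forth between point and voxel representations can be illustrated with a minimal NumPy sketch. This is not the DRINet implementation: the function names (`voxel_ids`, `point_to_voxel`, `voxel_to_point`, `multi_scale_pool`, `iterate`), the mean pooling, and the simple averaging used to fuse features are all assumptions chosen for illustration, and the learned layers of the actual modules are omitted.

```python
import numpy as np

def voxel_ids(points, voxel_size):
    """Return, for every point, the index of the occupied voxel containing it."""
    coords = np.floor(points / voxel_size).astype(np.int64)
    # Unique rows are the occupied voxels; `inverse` maps each point to its voxel.
    _, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)
    return inverse, int(inverse.max()) + 1

def point_to_voxel(point_feats, inverse, num_voxels):
    """Point -> voxel: average the features of all points in each voxel."""
    sums = np.zeros((num_voxels, point_feats.shape[1]))
    np.add.at(sums, inverse, point_feats)  # unbuffered scatter-add
    counts = np.bincount(inverse, minlength=num_voxels)[:, None]
    return sums / counts  # every occupied voxel holds at least one point

def voxel_to_point(voxel_feats, inverse):
    """Voxel -> point: each point gathers the feature of its voxel."""
    return voxel_feats[inverse]

def multi_scale_pool(points, point_feats, voxel_sizes):
    """Concatenate per-point context pooled at several voxel sizes."""
    pooled = []
    for size in voxel_sizes:
        inverse, n = voxel_ids(points, size)
        voxel_feats = point_to_voxel(point_feats, inverse, n)
        pooled.append(voxel_to_point(voxel_feats, inverse))
    return np.concatenate(pooled, axis=1)

def iterate(points, point_feats, voxel_size, num_iters):
    """Alternate the two directions; plain averaging stands in for learned fusion."""
    inverse, n = voxel_ids(points, voxel_size)
    for _ in range(num_iters):
        voxel_feats = point_to_voxel(point_feats, inverse, n)
        point_feats = 0.5 * (point_feats + voxel_to_point(voxel_feats, inverse))
    return point_feats
```

Because pooling is a per-voxel mean and gathering is a pure index lookup, the result for each point does not depend on the order in which points are stored, which is one way a dual-representation scheme can respect permutation invariance.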