CV AI · Feb 26, 2022

RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds Deep Learning

Zhiyuan Zhang, Binh-Son Hua, Sai-Kit Yeung

arXiv:2202.13094v214.973 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a key limitation in 3D point cloud deep learning for scene understanding: rotation-invariant convolutions tend to be less accurate than their translation-invariant counterparts. It offers a practical improvement for applications like robotics and autonomous driving, though the advance is incremental because it builds on existing rotation-invariant methods.

The paper tackles the problem of achieving rotation invariance in 3D point cloud convolutions. Rotation-invariant methods often underperform translation-invariant ones because the features they consume are less distinctive. The authors propose a new convolution operator that improves feature descriptiveness and achieves state-of-the-art accuracy in classification, segmentation, and retrieval tasks under challenging rotations.

3D point cloud deep learning is a promising field of research that allows a neural network to learn features of point clouds directly, making it a robust tool for solving 3D scene understanding tasks. While recent works show that point cloud convolutions can be invariant to translation and point permutation, investigations of the rotation invariance property for point cloud convolution have so far been scarce. Although some existing methods perform point cloud convolutions with rotation-invariant features, they generally do not perform as well as their translation-invariant-only counterparts. In this work, we argue that a key reason is that, compared to point coordinates, the rotation-invariant features consumed by point cloud convolution are not as distinctive. To address this problem, we propose a simple yet effective convolution operator that enhances feature distinction by designing powerful rotation-invariant features from the local regions. We consider the relationship between the point of interest and its neighbors, as well as the internal relationship of the neighbors, to substantially improve feature descriptiveness. Our network architecture can capture both local and global context by simply tuning the neighborhood size in each convolution layer. We conduct several experiments on synthetic and real-world point cloud classification, part segmentation, and shape retrieval to evaluate our method, which achieves state-of-the-art accuracy under challenging rotations.
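
To make the idea of rotation-invariant local features concrete, here is a minimal Python sketch. It is not the RIConv++ operator and does not reproduce the paper's feature set. The three features chosen (distance from the point of interest to each neighbor, distance between consecutive neighbors, and the cosine of the angle between them as seen from the point of interest) are assumptions made for illustration, as are all function names. The sketch only demonstrates that features built from distances and angles stay the same when the whole neighborhood is rotated, whereas raw coordinates do not.

```python
import math
import random


def sub(a, b):
    """Component-wise difference of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(dot(a, a))


def rotation_matrix(axis, angle):
    """Rotation about `axis` by `angle` radians (Rodrigues' formula)."""
    n = norm(axis)
    x, y, z = (c / n for c in axis)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return (
        (c + x * x * t, x * y * t - z * s, x * z * t + y * s),
        (y * x * t + z * s, c + y * y * t, y * z * t - x * s),
        (z * x * t - y * s, z * y * t + x * s, c + z * z * t),
    )


def rotate(p, R):
    """Apply rotation matrix R to point p."""
    return tuple(dot(row, p) for row in R)


def invariant_features(center, neighbors):
    """Hypothetical per-neighbor features built only from distances and angles.

    For neighbor i (with the next neighbor taken cyclically by index):
      - distance from the point of interest to neighbor i
      - distance from neighbor i to neighbor i+1 (a neighbor-to-neighbor relation)
      - cosine of the angle between the two neighbors, seen from the center
    """
    feats = []
    k = len(neighbors)
    for i, q in enumerate(neighbors):
        nxt = neighbors[(i + 1) % k]
        v, w = sub(q, center), sub(nxt, center)
        feats.append((norm(v), norm(sub(nxt, q)), dot(v, w) / (norm(v) * norm(w))))
    return feats


# Demo: rotate a small random neighborhood and compare the features.
random.seed(0)
center = (0.1, -0.2, 0.3)
neighbors = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(5)]
R = rotation_matrix((1.0, 2.0, 3.0), 1.234)

before = invariant_features(center, neighbors)
after = invariant_features(rotate(center, R), [rotate(q, R) for q in neighbors])

# Largest absolute change in any feature; only floating-point noise is expected.
max_diff = max(abs(a - b) for fa, fb in zip(before, after) for a, b in zip(fa, fb))
print("max feature change under rotation:", max_diff)
```

The sketch keeps the neighbors in a fixed index order. A real network would also need an ordering or aggregation that does not depend on the pose of the input, which this example does not attempt to address.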
