Flex-Convolution (Million-Scale Point-Cloud Learning Beyond Grid-Worlds)
The work addresses the challenge of processing irregular point-cloud data for applications such as 3D vision, though it appears incremental because it builds on existing convolution methods.
The paper tackles the problem of applying convolution to unstructured 3D point clouds by introducing flex-convolution, a generalization of the conventional convolution layer. It reports competitive performance on small benchmarks with fewer parameters and lower memory consumption, and significant improvements on a million-scale real-world dataset, where it processes 7 million points concurrently.
Traditional convolution layers are specifically designed to exploit the natural data representation of images: a fixed and regular grid. However, unstructured data such as 3D point clouds, which contain irregular neighborhoods, constantly break the grid-based data assumption. Therefore, applying best practices and design choices from 2D-image learning methods to point-cloud processing is not readily possible. In this work, we introduce flex-convolution, a natural generalization of the conventional convolution layer, along with an efficient GPU implementation. We demonstrate competitive performance on rather small benchmark sets using fewer parameters and lower memory consumption, and we obtain significant improvements on a million-scale real-world dataset. Ours is the first approach that can efficiently process 7 million points concurrently.
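To make the idea of convolving over irregular neighborhoods concrete, here is a minimal pure-Python sketch. It is not the paper's GPU implementation. It assumes, for illustration only, that each filter weight is an affine function of the neighbor's offset from the center point, w = <theta, offset> + bias; the abstract itself does not spell out the weight model. The names `k_nearest`, `point_conv`, `theta`, and `bias` are hypothetical, and the brute-force neighbor search is only practical for toy inputs.

```python
import math


def k_nearest(points, center_idx, k):
    """Return indices of the k points closest to points[center_idx].

    Brute-force search; the center itself is included (distance 0).
    """
    center = points[center_idx]
    order = sorted(range(len(points)), key=lambda j: math.dist(center, points[j]))
    return order[:k]


def point_conv(points, features, theta, bias, k):
    """Single-channel convolution over irregular point neighborhoods.

    Assumed weight model (an illustration, not taken from the abstract):
        w(offset) = <theta, offset> + bias
    where offset is the neighbor position minus the center position.
    """
    out = []
    for i, center in enumerate(points):
        acc = 0.0
        for j in k_nearest(points, i, k):
            offset = [p - c for p, c in zip(points[j], center)]
            weight = sum(t * o for t, o in zip(theta, offset)) + bias
            acc += weight * features[j]
        out.append(acc)
    return out


# Four points with no grid structure, one scalar feature each.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (5.0, 5.0, 5.0)]
features = [1.0, 2.0, 3.0, 4.0]
result = point_conv(points, features, theta=(1.0, 0.0, 0.0), bias=0.5, k=2)
print(result)
```

Because the weight is computed from a continuous offset rather than looked up by grid index, the same small parameter set (`theta`, `bias`) applies to any neighborhood layout, which is one plausible reading of how such a layer can use fewer parameters than a grid-based filter.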