FoldingNet: Point Cloud Auto-encoder via Deep Grid Deformation
This work addresses unsupervised representation learning for point clouds. The contribution is incremental in that it builds on existing methods such as PointNet.
The authors tackle unsupervised learning on point clouds by proposing FoldingNet, an auto-encoder with a novel folding-based decoder that deforms a 2D grid onto 3D object surfaces. The model achieves low reconstruction errors and higher linear SVM classification accuracy than the benchmark, while its decoder uses only about 7% of the parameters of a fully-connected decoder.
Recent deep networks that directly handle points in a point set, e.g., PointNet, have been state-of-the-art for supervised learning tasks on point clouds such as classification and segmentation. In this work, a novel end-to-end deep auto-encoder is proposed to address unsupervised learning challenges on point clouds. On the encoder side, a graph-based enhancement is applied on top of PointNet to promote local structures. Then, a novel folding-based decoder deforms a canonical 2D grid onto the underlying 3D object surface of a point cloud, achieving low reconstruction errors even for objects with delicate structures. The proposed decoder uses only about 7% of the parameters of a decoder with fully-connected neural networks, yet leads to a more discriminative representation that achieves higher linear SVM classification accuracy than the benchmark. In addition, the proposed decoder structure is shown, in theory, to be a generic architecture that is able to reconstruct an arbitrary point cloud from a 2D grid. Our code is available at http://www.merl.com/research/license#FoldingNet
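
To make the folding idea concrete, the following is a minimal sketch of a folding-style decoder, not the authors' implementation. It assumes that a "fold" concatenates the codeword to every grid point and passes the result through a small shared MLP, and that two such folds are applied in sequence. The layer widths, the codeword size CODE_DIM, the grid resolution M, and the random (untrained) weights are all hypothetical values chosen for illustration.

```python
import random


def make_grid(m):
    """Return m*m points evenly spaced on the square [-1, 1] x [-1, 1]."""
    step = 2.0 / (m - 1)
    return [(-1.0 + i * step, -1.0 + j * step) for i in range(m) for j in range(m)]


def init_mlp(sizes, rng):
    """Create random weights for a small MLP; sizes lists the layer widths."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        b = [0.0] * n_out
        layers.append((w, b))
    return layers


def mlp(layers, x):
    """Apply the MLP to one input vector, with ReLU on all but the last layer."""
    for k, (w, b) in enumerate(layers):
        x = [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]
        if k < len(layers) - 1:
            x = [max(0.0, v) for v in x]
    return x


def fold(layers, codeword, points):
    """One fold: concatenate the codeword to each point, then apply the shared MLP."""
    return [mlp(layers, list(codeword) + list(p)) for p in points]


def decode(codeword, m, fold1, fold2):
    """Deform an m x m 2D grid into m*m 3D points using two consecutive folds."""
    grid = make_grid(m)
    intermediate = fold(fold1, codeword, grid)   # 2D grid points -> 3D points
    return fold(fold2, codeword, intermediate)   # 3D points -> refined 3D points


rng = random.Random(0)
CODE_DIM = 8   # hypothetical codeword size, kept small for the sketch
M = 5          # hypothetical grid resolution (5 x 5 = 25 output points)

fold1 = init_mlp([CODE_DIM + 2, 16, 3], rng)  # input: codeword + 2D grid point
fold2 = init_mlp([CODE_DIM + 3, 16, 3], rng)  # input: codeword + 3D point
codeword = [rng.gauss(0.0, 1.0) for _ in range(CODE_DIM)]

points = decode(codeword, M, fold1, fold2)
print(len(points), "points of dimension", len(points[0]))
```

Because the same small MLP is shared across all grid points, the number of decoder weights does not grow with the number of output points, which is one plausible reading of why such a decoder can be much smaller than a fully-connected one. In a trained model the weights would be learned by minimizing a reconstruction error between the input and the decoded point cloud.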