CV | AI | LG | Jan 19, 2022

ConDor: Self-Supervised Canonicalization of 3D Pose for Partial Shapes

arXiv:2201.07788v2 | 46 citations
AI Analysis

This paper addresses the challenge of inconsistent 3D pose in real-world data, a problem that matters to researchers and practitioners in 3D vision, though the contribution is incremental in that it builds on existing Tensor Field Networks.

The paper tackles the problem of generalizing 3D object understanding to in-the-wild shapes by introducing ConDor, a self-supervised method that canonicalizes 3D pose for full and partial point clouds, outperforming existing methods on four new metrics and enabling applications like operation on depth images and annotation transfer.

Progress in 3D object understanding has relied on manually canonicalized shape datasets that contain instances with consistent position and orientation (3D pose). This has made it hard to generalize these methods to in-the-wild shapes, e.g., from internet model collections or depth sensors. ConDor is a self-supervised method that learns to Canonicalize the 3D orientation and position for full and partial 3D point clouds. We build on top of Tensor Field Networks (TFNs), a class of permutation- and rotation-equivariant, and translation-invariant 3D networks. During inference, our method takes an unseen full or partial 3D point cloud at an arbitrary pose and outputs an equivariant canonical pose. During training, this network uses self-supervision losses to learn the canonical pose from an un-canonicalized collection of full and partial 3D point clouds. ConDor can also learn to consistently co-segment object parts without any supervision. Extensive quantitative results on four new metrics show that our approach outperforms existing methods while enabling new applications such as operation on depth images and annotation transfer.
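
To make the idea of pose canonicalization concrete, the following is a minimal NumPy sketch of a classical PCA baseline. It is not ConDor's learned, TFN-based method. It only illustrates the property the abstract describes: the same shape, observed under an arbitrary rotation and translation, should map to the same canonical pose. The function names `canonicalize_pca` and `random_rotation` and the synthetic point cloud are assumptions made for illustration.

```python
import numpy as np

def canonicalize_pca(points):
    """Return (canonical_points, axes, centroid) for an (N, 3) point cloud.

    Illustrative PCA baseline only; not the method proposed in the paper.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid              # removes the translation
    cov = centered.T @ centered / len(points)
    _, vecs = np.linalg.eigh(cov)             # eigenvectors as columns, ascending eigenvalues
    axes = vecs[:, ::-1].copy()               # largest-variance axis first
    for i in range(2):
        # Resolve each axis's sign ambiguity using the third moment (skew).
        if np.sum((centered @ axes[:, i]) ** 3) < 0:
            axes[:, i] *= -1
    # Third axis from the cross product, so the frame is a proper rotation.
    axes[:, 2] = np.cross(axes[:, 0], axes[:, 1])
    return centered @ axes, axes, centroid

def random_rotation(rng):
    """Draw a random proper rotation matrix (determinant +1) via QR."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1
    return q

rng = np.random.default_rng(0)
# Synthetic anisotropic, asymmetric cloud (an assumption for this demo).
cloud = rng.exponential(size=(500, 3)) * np.array([3.0, 2.0, 1.0])
R = random_rotation(rng)
t = rng.normal(size=3)
moved = cloud @ R.T + t                       # same shape at an arbitrary pose

canon_a, axes_a, _ = canonicalize_pca(cloud)
canon_b, axes_b, _ = canonicalize_pca(moved)
print(np.allclose(canon_a, canon_b, atol=1e-6))
```

This baseline depends on well-separated principal axes and a nonzero skew along them, so it is unreliable for symmetric shapes, and because both the centroid and the covariance change when points are missing, it is unreliable for partial shapes too. Those are the cases where a learned approach such as the one described above is motivated.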

Code Implementations: 1 repo
