CV · LG · Dec 8, 2020

Canonical Capsules: Self-Supervised Capsules in Canonical Pose

Weiwei Sun, Andrea Tagliasacchi, Boyang Deng, Sara Sabour, Soroosh Yazdani, Geoffrey Hinton, Kwang Moo Yi

arXiv:2012.04718v221.744 citations · Has Code

Originality: Highly original

AI Analysis

This work addresses the problem of learning robust, object-centric representations for 3D point clouds in a self-supervised manner, which is significant for researchers and practitioners working with 3D data.

This paper introduces a self-supervised capsule architecture for 3D point clouds that learns object decompositions and canonicalization by training with randomly rotated object pairs. The method achieves state-of-the-art performance in 3D point cloud reconstruction, canonicalization, and unsupervised classification without requiring classification labels or manually-aligned datasets.

We propose a self-supervised capsule architecture for 3D point clouds. We compute capsule decompositions of objects through permutation-equivariant attention, and self-supervise the process by training with pairs of randomly rotated objects. Our key idea is to aggregate the attention masks into semantic keypoints, and use these to supervise a decomposition that satisfies the capsule invariance/equivariance properties. This not only enables the training of a semantically consistent decomposition, but also allows us to learn a canonicalization operation that enables object-centric reasoning. To train our neural network we require neither classification labels nor manually-aligned training datasets. Yet, by learning an object-centric representation in a self-supervised manner, our method outperforms the state-of-the-art on 3D point cloud reconstruction, canonicalization, and unsupervised classification.
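The keypoint idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: in the paper the attention masks come from a learned permutation-equivariant network, whereas `toy_attention` here is a hypothetical, hand-written stand-in that is rotation-invariant by construction, and all function names are assumptions made for illustration. The sketch shows that when attention masks are invariant to rotation, the attention-weighted keypoints rotate with the object, so the mismatch between the two views of a randomly rotated pair can serve as a self-supervised equivariance loss.

```python
import numpy as np

def random_rotation(rng):
    """Draw a random 3x3 rotation matrix via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))   # fix column signs of the factorization
    if np.linalg.det(q) < 0:      # force a proper rotation (det = +1)
        q[:, 0] = -q[:, 0]
    return q

def toy_attention(points, num_capsules):
    """Hypothetical stand-in for learned attention (NOT the paper's network).

    Soft-assigns each point to one of K capsules based on its distance to
    the centroid, which is unchanged by rotation. Returns an (N, K) array
    whose rows sum to 1.
    """
    radii = np.linalg.norm(points - points.mean(axis=0), axis=1)
    centers = np.linspace(radii.min(), radii.max(), num_capsules)
    logits = -((radii[:, None] - centers[None, :]) ** 2) / 0.1
    logits -= logits.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def keypoints_from_attention(points, attention):
    """Aggregate attention masks into K keypoints (attention-weighted means)."""
    mass = attention.sum(axis=0)                   # total attention per capsule
    return (attention.T @ points) / mass[:, None]  # shape (K, 3)

def equivariance_loss(points, rotation, num_capsules=4):
    """Mean squared distance between rotated keypoints of the original view
    and keypoints computed directly from the rotated view."""
    rotated = points @ rotation.T
    kp_a = keypoints_from_attention(points, toy_attention(points, num_capsules))
    kp_b = keypoints_from_attention(rotated, toy_attention(rotated, num_capsules))
    return float(np.mean(np.sum((kp_a @ rotation.T - kp_b) ** 2, axis=1)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(256, 3))   # synthetic point cloud, illustration only
    print(equivariance_loss(cloud, random_rotation(rng)))
```

With the hand-written invariant attention the loss is zero up to floating-point error. With a learned attention network it would generally be nonzero, and minimizing it over pairs of randomly rotated objects is one way to push the decomposition toward the invariance/equivariance properties the abstract describes.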
