Info3D: Representation Learning on 3D Objects using Mutual Information Maximization and Contrastive Learning
This work addresses the need for better unsupervised methods in 3D computer vision and improves results on shape analysis tasks, although the contribution appears incremental because it builds on the existing InfoMax and contrastive learning principles.
The paper tackles unsupervised representation learning for 3D objects by extending the InfoMax and contrastive learning principles to improve representations on aligned datasets and to obtain rotation invariance, and it reports state-of-the-art results on clustering, transfer learning, and shape retrieval.
A major endeavor of computer vision is to represent, understand and extract structure from 3D data. Towards this goal, unsupervised learning is a powerful and necessary tool. Most current unsupervised methods for 3D shape analysis use datasets that are aligned, require objects to be reconstructed, and suffer from deteriorated performance on downstream tasks. To solve these issues, we propose to extend the InfoMax and contrastive learning principles to 3D shapes. We show that we can maximize the mutual information between 3D objects and their "chunks" to improve the representations in aligned datasets. Furthermore, we can achieve rotation invariance in the $SO(3)$ group by maximizing the mutual information between the 3D objects and their geometrically transformed versions. Finally, we conduct several experiments, such as clustering, transfer learning, and shape retrieval, and achieve state-of-the-art results.
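To make the two ingredients of the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes an InfoNCE-style contrastive objective as the mutual information estimator (the abstract does not say which estimator is used), and the helper names `random_rotation`, `nearest_chunk`, and `info_nce` are hypothetical. For a batch of N pairs, the InfoNCE loss L gives the standard lower bound I(X; Y) >= log(N) - L, so minimizing L pushes the bound up.

```python
import numpy as np

def random_rotation(rng):
    """Sample a 3x3 rotation matrix via QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))      # fix column signs of the QR factor
    if np.linalg.det(q) < 0:         # reflection: flip one axis to get det = +1
        q[:, 0] = -q[:, 0]
    return q

def nearest_chunk(points, seed_index, k):
    """Return the k points closest to a seed point (one hypothetical notion of a 'chunk')."""
    dists = np.linalg.norm(points - points[seed_index], axis=1)
    return points[np.argsort(dists)[:k]]

def info_nce(global_feats, local_feats, temperature=0.1):
    """InfoNCE loss where row i of global_feats is the positive for row i of local_feats."""
    g = global_feats / np.linalg.norm(global_feats, axis=1, keepdims=True)
    l = local_feats / np.linalg.norm(local_feats, axis=1, keepdims=True)
    logits = g @ l.T / temperature                       # pairwise cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                   # positives sit on the diagonal

rng = np.random.default_rng(0)
cloud = rng.normal(size=(128, 3))             # toy point cloud standing in for a 3D object
rotated = cloud @ random_rotation(rng).T      # geometrically transformed view
chunk = nearest_chunk(cloud, seed_index=0, k=32)  # local view of the same object
```

In a training loop, an encoder (not shown, and not specified in the abstract) would embed `cloud`, `chunk`, and `rotated`, and `info_nce` would be applied to the object/chunk pairs and to the object/rotated pairs, with other objects in the batch serving as negatives.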