CV, LG | Oct 22, 2018

Unsupervised Learning of Shape and Pose with Differentiable Point Clouds

arXiv:1810.09381v1 | 262 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of 3D reconstruction from unlabeled images. The advance is rated incremental because it builds on existing methods, adding new components for handling pose ambiguity and for representing shape.

The paper learns 3D shape and camera pose from unlabeled images by training a convolutional network to minimize reprojection error. It combines an ensemble of pose predictors with differentiable point clouds, and the authors report accurate pose estimation and detailed shape models.
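As a rough illustration of how an ensemble of pose predictors might be used, the sketch below picks, for each sample, the ensemble member with the lowest reprojection error and uses that member's pose as a regression target for a student. This selection rule is an assumption made for illustration; the summary above only states that an ensemble is distilled into a single student. The function names `best_of_ensemble` and `distillation_targets` are hypothetical.

```python
import numpy as np

def best_of_ensemble(losses):
    """Select the lowest-error ensemble member per sample (assumed rule).

    losses: array of shape (batch, K), one reprojection error per
            sample and per ensemble member.
    Returns (best, mean_min): the index of the best member for each
    sample, and the mean of the per-sample minimum errors.
    """
    losses = np.asarray(losses, dtype=float)
    best = losses.argmin(axis=1)
    rows = np.arange(losses.shape[0])
    return best, float(losses[rows, best].mean())

def distillation_targets(poses, best):
    """Pick the best member's pose per sample as the student's target.

    poses: array of shape (batch, K, D), one D-dimensional pose per
           sample and per ensemble member.
    best:  array of shape (batch,), as returned by best_of_ensemble.
    """
    poses = np.asarray(poses, dtype=float)
    rows = np.arange(poses.shape[0])
    return poses[rows, best]

# Toy usage: 2 samples, 3 ensemble members, 2-dimensional poses.
losses = [[0.5, 0.2, 0.9], [0.1, 0.4, 0.3]]
poses = np.arange(12, dtype=float).reshape(2, 3, 2)
best, mean_min = best_of_ensemble(losses)
targets = distillation_targets(poses, best)
```

Taking the minimum over members lets different predictors specialize on different plausible poses, which is one common way to cope with ambiguity; the student then needs only a single forward pass at test time.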

We address the problem of learning accurate 3D shape and camera pose from a collection of unlabeled category-specific images. We train a convolutional network to predict both the shape and the pose from a single image by minimizing the reprojection error: given several views of an object, the projections of the predicted shapes to the predicted camera poses should match the provided views. To deal with pose ambiguity, we introduce an ensemble of pose predictors which we then distill to a single "student" model. To allow for efficient learning of high-fidelity shapes, we represent the shapes by point clouds and devise a formulation allowing for differentiable projection of these. Our experiments show that the distilled ensemble of pose predictors learns to estimate the pose accurately, while the point cloud representation allows for predicting detailed shape models. The supplementary video can be found at https://www.youtube.com/watch?v=LuIGovKeo60
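The idea of projecting a point cloud in a differentiable way can be sketched as follows. This is a simplified stand-in, not the paper's formulation: it assumes an orthographic camera, ignores occlusion, and splats each point as a 2D Gaussian so that the resulting silhouette is a smooth function of point positions and pose. The names `project_point_cloud` and `reprojection_error`, the `grid` and `sigma` parameters, and the squared-error loss are all assumptions made for illustration. NumPy is used for clarity; the same operations written in an autodiff framework would yield gradients.

```python
import numpy as np

def project_point_cloud(points, rotation, grid=32, sigma=1.0):
    """Render a soft silhouette from a point cloud (illustrative sketch).

    points:   (N, 3) array with coordinates roughly in [-1, 1].
    rotation: (3, 3) rotation matrix standing in for the camera pose.
    Returns a (grid, grid) image with values in [0, 1), indexed [y, x].
    """
    pts = np.asarray(points, dtype=float) @ np.asarray(rotation, dtype=float).T
    # Orthographic projection: drop depth, map [-1, 1] to pixel coordinates.
    xy = (pts[:, :2] + 1.0) * 0.5 * (grid - 1)
    coords = np.arange(grid, dtype=float)
    # Separable Gaussian footprint of every point along x and y.
    gx = np.exp(-0.5 * ((coords[None, :] - xy[:, 0:1]) / sigma) ** 2)
    gy = np.exp(-0.5 * ((coords[None, :] - xy[:, 1:2]) / sigma) ** 2)
    # Sum the footprints of all points into one density image.
    density = np.einsum("ny,nx->yx", gy, gx)
    # Squash the density into a soft occupancy value below 1.
    return 1.0 - np.exp(-density)

def reprojection_error(pred, target):
    """Mean squared difference between rendered and observed silhouettes."""
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))

# Toy usage: one point, rendered before and after a 90-degree turn about z.
point = np.array([[0.5, 0.0, 0.0]])
identity = np.eye(3)
rot_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
view_a = project_point_cloud(point, identity, grid=33)
view_b = project_point_cloud(point, rot_z90, grid=33)
error = reprojection_error(view_a, view_b)
```

Because every pixel depends smoothly on the point coordinates and the rotation, a loss such as `reprojection_error` can in principle be minimized with respect to both shape and pose.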

Code Implementations: 3 repos
