CV · Nov 22, 2022

MagicPony: Learning Articulated 3D Animals in the Wild

Oxford
arXiv:2211.12497v3 · 118 citations · h-index: 105
Originality: Highly original
AI Analysis

The paper addresses 3D reconstruction from in-the-wild images, with applications in computer vision and graphics, and is assessed as a strong, specific gain.

The paper tackles predicting the 3D shape, articulation, viewpoint, texture, and lighting of articulated animals from a single image. MagicPony outperforms prior work and generalizes well to reconstructing art, even though it is trained only on real images.

We consider the problem of predicting the 3D shape, articulation, viewpoint, texture, and lighting of an articulated animal like a horse given a single test image as input. We present a new method, dubbed MagicPony, that learns this predictor purely from in-the-wild single-view images of the object category, with minimal assumptions about the topology of deformation. At its core is an implicit-explicit representation of articulated shape and appearance, combining the strengths of neural fields and meshes. In order to help the model understand an object's shape and pose, we distil the knowledge captured by an off-the-shelf self-supervised vision transformer and fuse it into the 3D model. To overcome local optima in viewpoint estimation, we further introduce a new viewpoint sampling scheme that comes at no additional training cost. MagicPony outperforms prior work on this challenging task and demonstrates excellent generalisation in reconstructing art, despite the fact that it is only trained on real images.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
