CV · GR · May 10, 2023

Reconstructing Animatable Categories from Videos

arXiv:2305.06351v1 · 53 citations
Originality: Incremental advance
AI Analysis

The paper addresses scalable animatable 3D modeling for arbitrary categories. The advance is incremental: it extends differentiable rendering, previously limited to rigid categories or single instances, to non-rigid categories.

The paper builds animatable 3D models of categories such as humans, cats, and dogs from monocular videos. It learns each category model from 50-100 internet videos while disentangling variation across instances from motion over time.

Building animatable 3D models is challenging due to the need for 3D scans, laborious registration, and manual rigging, which are difficult to scale to arbitrary categories. Recently, differentiable rendering has provided a pathway to obtain high-quality 3D models from monocular videos, but these are limited to rigid categories or single instances. We present RAC, which builds category 3D models from monocular videos while disentangling variations over instances and motion over time. Three key ideas are introduced to solve this problem: (1) specializing a skeleton to instances via optimization, (2) a method for latent space regularization that encourages shared structure across a category while maintaining instance details, and (3) using 3D background models to disentangle objects from the background. We show that 3D models of humans, cats, and dogs can be learned from 50-100 internet videos.

Code Implementations: 2 repos
