CV · Dec 9, 2022

Seeing a Rose in Five Thousand Ways

DeepMind, Stanford
arXiv:2212.04965v2 · 14 citations · h-index: 77
Originality: Highly original
AI Analysis

This paper addresses the challenge of 3D understanding and generation from limited data in computer vision, and it presents a novel method rather than an incremental improvement.

The paper tackles the problem of learning object intrinsics (geometry, texture, material) from a single image containing multiple instances, achieving superior results on tasks like intrinsic image decomposition, shape generation, view synthesis, and relighting.

What is a rose, visually? A rose comprises its intrinsics, including the distribution of geometry, texture, and material specific to its object category. With knowledge of these intrinsic properties, we may render roses of different sizes and shapes, in different poses, and under different lighting conditions. In this work, we build a generative model that learns to capture such object intrinsics from a single image, such as a photo of a bouquet. Such an image includes multiple instances of an object type. These instances all share the same intrinsics, but appear different due to a combination of variance within these intrinsics and differences in extrinsic factors, such as pose and illumination. Experiments show that our model successfully learns object intrinsics (distribution of geometry, texture, and material) for a wide range of objects, each from a single Internet image. Our method achieves superior results on multiple downstream tasks, including intrinsic image decomposition, shape and image generation, view synthesis, and relighting.
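The split between intrinsics and extrinsics described above can be illustrated with a toy example. The sketch below is not the paper's model: it assumes a simple Lambertian (diffuse) surface, and the names `lambertian_shading`, `render_pixel`, `albedo`, and `normal` are hypothetical choices made for illustration. It only shows how one fixed set of intrinsic properties yields different pixel values when the lighting direction, an extrinsic factor, changes.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def lambertian_shading(normal, light_dir):
    """Diffuse shading: clamped cosine between surface normal and light direction."""
    n = normalize(normal)
    l = normalize(light_dir)
    return max(0.0, sum(a * b for a, b in zip(n, l)))

def render_pixel(albedo, normal, light_dir):
    """Pixel color = albedo (intrinsic) times shading (depends on extrinsic lighting)."""
    s = lambertian_shading(normal, light_dir)
    return tuple(a * s for a in albedo)

# Hypothetical intrinsics for one surface point on a petal (assumed values).
albedo = (0.8, 0.1, 0.2)   # reddish reflectance
normal = (0.0, 0.0, 1.0)   # surface facing the camera

# Same intrinsics, two different lighting conditions (extrinsics).
frontal = render_pixel(albedo, normal, (0.0, 0.0, 1.0))
oblique = render_pixel(albedo, normal, (1.0, 0.0, 1.0))
```

Under this assumed model, intrinsic image decomposition amounts to recovering `albedo` and the shading term from observed pixels, and relighting amounts to re-rendering the recovered intrinsics with a new `light_dir`.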

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
