CV · Aug 7, 2019

Location Field Descriptors: Single Image 3D Model Retrieval in the Wild

Alexander Grabner, Peter M. Roth, Vincent Lepetit

arXiv:1908.02853v19.440 citations

Originality: Highly original

AI Analysis

This work addresses retrieving 3D models from single images in unconstrained environments, with applications in computer vision and graphics. It presents a novel method rather than an incremental improvement.

The paper tackles single image 3D model retrieval in the wild by introducing Location Field Descriptors. Location fields encode correspondences between 2D pixels and 3D surface coordinates, so they capture shape and pose while discarding appearance variations. The method outperforms state-of-the-art approaches by up to 20% absolute on multiple retrieval metrics across three real-world datasets.

We present Location Field Descriptors, a novel approach for single image 3D model retrieval in the wild. In contrast to previous methods that directly map 3D models and RGB images to an embedding space, we establish a common low-level representation in the form of location fields from which we compute pose invariant 3D shape descriptors. Location fields encode correspondences between 2D pixels and 3D surface coordinates and, thus, explicitly capture 3D shape and 3D pose information without appearance variations which are irrelevant for the task. This early fusion of 3D models and RGB images results in three main advantages: First, the bottleneck location field prediction acts as a regularizer during training. Second, major parts of the system benefit from training on a virtually infinite amount of synthetic data. Finally, the predicted location fields are visually interpretable and unblackbox the system. We evaluate our proposed approach on three challenging real-world datasets (Pix3D, Comp, and Stanford) with different object categories and significantly outperform the state-of-the-art by up to 20% absolute in multiple 3D retrieval metrics.
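To make the idea of a location field concrete, the sketch below builds one from geometry alone: every foreground pixel is assigned the 3D surface point it sees. This is not the paper's pipeline, which predicts location fields from RGB images; it only illustrates the pixel-to-surface correspondence described in the abstract. It assumes a pinhole camera, a known depth map, a known pose `(R, t)` mapping object coordinates to camera coordinates, and coordinates expressed in the object's own frame. The function name `location_field` and all parameters are hypothetical.

```python
def location_field(depth, fx, fy, cx, cy, R, t):
    """Return an H x W grid of per-pixel 3D surface coordinates.

    Illustrative sketch only (hypothetical helper, not the paper's code).
    depth : H x W nested list of depths; values <= 0 or None mean background.
    fx, fy, cx, cy : assumed pinhole intrinsics.
    R, t : assumed pose with X_cam = R @ X_obj + t (R is 3x3, t has length 3).
    Foreground cells hold an (x, y, z) tuple in object coordinates;
    background cells hold None.
    """
    field = []
    for v, row in enumerate(depth):
        out_row = []
        for u, z in enumerate(row):
            if z is None or z <= 0:
                out_row.append(None)  # no surface visible at this pixel
                continue
            # Back-project pixel (u, v) with depth z into camera coordinates.
            x_cam = (u - cx) * z / fx
            y_cam = (v - cy) * z / fy
            # Invert the pose: X_obj = R^T (X_cam - t).
            d = (x_cam - t[0], y_cam - t[1], z - t[2])
            x_obj = tuple(sum(R[k][i] * d[k] for k in range(3)) for i in range(3))
            out_row.append(x_obj)
        field.append(out_row)
    return field


# Usage: a 2 x 1 depth map with a single foreground pixel, and an object
# rotated 90 degrees about the camera's z-axis and placed 2 units away.
R = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t = (0.0, 0.0, 2.0)
field = location_field([[0.0], [2.0]], fx=2.0, fy=2.0, cx=0.0, cy=0.0, R=R, t=t)
```

Because the stored coordinates live in the object frame rather than the camera frame, the same surface point gets the same value regardless of texture or lighting, which is the sense in which the representation is free of appearance variations.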
