CV · GR · LG · Aug 11, 2020

GeLaTO: Generative Latent Textured Objects

arXiv:2008.04852v1 · 13 citations
AI Analysis

This paper addresses 3D reconstruction of complex objects such as eyeglasses and cars. The contribution is incremental: it builds on existing proxy-based methods from computer graphics.

The paper tackles accurate modeling of 3D objects with transparency, reflections, and thin structures. It proposes GeLaTO, a compact representation that combines coarse shape proxies with neural textures. The representation enables reconstruction from sparse views, and the authors demonstrate it on a challenging real-world dataset of eyeglasses frames.

Accurate modeling of 3D objects exhibiting transparency, reflections and thin structures is an extremely challenging problem. Inspired by billboards and geometric proxies used in computer graphics, this paper proposes Generative Latent Textured Objects (GeLaTO), a compact representation that combines a set of coarse shape proxies defining low frequency geometry with learned neural textures, to encode both medium and fine scale geometry as well as view-dependent appearance. To generate the proxies' textures, we learn a joint latent space allowing category-level appearance and geometry interpolation. The proxies are independently rasterized with their corresponding neural texture and composited using a U-Net, which generates an output photorealistic image including an alpha map. We demonstrate the effectiveness of our approach by reconstructing complex objects from a sparse set of views. We show results on a dataset of real images of eyeglasses frames, which are particularly challenging to reconstruct using classical methods. We also demonstrate that these coarse proxies can be handcrafted when the underlying object geometry is easy to model, like eyeglasses, or generated using a neural network for more complex categories, such as cars.
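
The rasterize-then-composite step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes each proxy has already been rasterized into a per-pixel UV map, uses plain 4-channel RGBA textures in place of learned neural textures, and replaces the U-Net compositor with fixed front-to-back "over" blending. The function names `sample_texture`, `composite_over`, and `render_proxies` are hypothetical.

```python
import numpy as np

def sample_texture(texture, uv):
    """Nearest-neighbour lookup of a (H, W, C) texture at UV coords in [0, 1].

    uv has shape (h, w, 2) with u indexing columns and v indexing rows.
    Returns an array of shape (h, w, C).
    """
    th, tw, _ = texture.shape
    cols = np.clip(np.round(uv[..., 0] * (tw - 1)).astype(int), 0, tw - 1)
    rows = np.clip(np.round(uv[..., 1] * (th - 1)).astype(int), 0, th - 1)
    return texture[rows, cols]

def composite_over(layers):
    """Front-to-back 'over' blending of RGBA layers, each of shape (h, w, 4).

    Assumption: this fixed rule stands in for the learned U-Net compositor.
    layers[0] is nearest to the camera. Returns premultiplied RGB plus alpha.
    """
    h, w, _ = layers[0].shape
    rgb = np.zeros((h, w, 3))
    alpha = np.zeros((h, w, 1))
    for layer in layers:
        a = layer[..., 3:4]
        remaining = 1.0 - alpha          # light not yet blocked by nearer layers
        rgb += remaining * a * layer[..., :3]
        alpha += remaining * a
    return np.concatenate([rgb, alpha], axis=-1)

def render_proxies(textures, uv_maps):
    """Sample each proxy's texture independently, then composite the results."""
    layers = [sample_texture(t, uv) for t, uv in zip(textures, uv_maps)]
    return composite_over(layers)
```

In the paper's setting the textures would hold learned feature channels generated from a latent code, and the compositing network would be trained to produce the final image and alpha map. The sketch only shows the data flow from per-proxy texture lookup to a single composited output.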

