CV · RO · Sep 13, 2022

Multiple View Performers for Shape Completion

arXiv:2209.06291v1 · 1 citation · h-index: 34
Originality: Incremental advance
AI Analysis

This paper addresses shape completion for 3D reconstruction. Its method avoids registration of multiple depth views and uses a causal Transformer, but the contribution appears incremental because it builds on existing Performers and associative-memory techniques.

The paper tackles 3D shape completion from sequential views by proposing the Multiple View Performer (MVP), a linear-attention Transformer architecture that uses compressed memory for efficient attention to past observations, achieving generalization gains compared to baselines.

We propose the Multiple View Performer (MVP) - a new architecture for 3D shape completion from a series of temporally sequential views. MVP accomplishes this task by using linear-attention Transformers called Performers. Our model allows the current observation of the scene to attend to the previous ones for more accurate infilling. The history of past observations is compressed via the compact associative memory approximating modern continuous Hopfield memory, but crucially of size independent from the history length. We compare our model with several baselines for shape completion over time, demonstrating the generalization gains that MVP provides. To the best of our knowledge, MVP is the first multiple view voxel reconstruction method that does not require registration of multiple depth views and the first causal Transformer based model for 3D shape completion.
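To illustrate how a history of observations can be compressed into a memory whose size does not grow with the history length, here is a minimal NumPy sketch of causal linear attention with a running summary. It is not the authors' implementation. The class name `LinearAttentionMemory`, the helper `positive_features`, and all shapes are hypothetical, and the feature map is the positive random-feature approximation of the softmax kernel commonly associated with Performers, assumed here for illustration.

```python
import numpy as np


def positive_features(x, proj):
    """Positive random features approximating the softmax kernel.

    x:    (n, d) array of query or key vectors
    proj: (m, d) array of random Gaussian projections (assumed fixed)
    """
    m = proj.shape[0]
    sq_norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)
    return np.exp(x @ proj.T - sq_norm) / np.sqrt(m)


class LinearAttentionMemory:
    """Fixed-size summary of all keys and values seen so far (hypothetical name)."""

    def __init__(self, num_features, value_dim):
        # S accumulates sum_i phi(k_i) v_i^T; z accumulates sum_i phi(k_i).
        self.S = np.zeros((num_features, value_dim))
        self.z = np.zeros(num_features)

    def update(self, key_features, values):
        # Fold the tokens of one new view into the running summary.
        self.S += key_features.T @ values
        self.z += key_features.sum(axis=0)

    def read(self, query_features, eps=1e-9):
        # Normalised attention output over everything stored so far.
        numerator = query_features @ self.S
        denominator = query_features @ self.z + eps
        return numerator / denominator[:, None]
```

In this sketch, each new view first calls `update` with its own keys and values and then calls `read` with its queries, so the current observation attends to itself and to all earlier ones. The arrays `S` and `z` keep the same shape no matter how many views have been processed, which is the property the abstract describes as a memory size independent of the history length.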

