CV · AI · LG · Apr 27, 2023

Analogy-Forming Transformers for Few-Shot 3D Parsing

arXiv:2304.14382v2 · 2 citations · h-index: 45
Originality: Highly original
AI Analysis

This paper addresses few-shot 3D parsing for computer vision applications, offering a memory-retrieval approach that improves performance in low-data scenarios.

The paper tackles 3D object segmentation in few-shot settings by introducing Analogical Networks, which use analogical reasoning over retrieved memory scenes to predict part structures. The model is competitive with state-of-the-art 3D segmentation transformers in many-shot settings, outperforms them in few-shot settings, and segments novel categories without weight updates.

We present Analogical Networks, a model that encodes domain knowledge explicitly, in a collection of structured labelled 3D scenes, in addition to implicitly, as model parameters, and segments 3D object scenes with analogical reasoning: instead of mapping a scene to part segments directly, our model first retrieves related scenes from memory and their corresponding part structures, and then predicts analogous part structures for the input scene, via an end-to-end learnable modulation mechanism. By conditioning on more than one retrieved memory, the model predicts compositions of structures that mix and match parts across the retrieved memories. One-shot, few-shot, and many-shot learning are treated uniformly in Analogical Networks, by conditioning on the appropriate set of memories, whether taken from a single, few, or many memory exemplars, and inferring analogous parses. We show Analogical Networks are competitive with state-of-the-art 3D segmentation transformers in many-shot settings, and outperform them, as well as existing paradigms of meta-learning and few-shot learning, in few-shot settings. Analogical Networks successfully segment instances of novel object categories simply by expanding their memory, without any weight updates. Our code and models are publicly available on the project webpage: http://analogicalnets.github.io/.
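The retrieve-then-parse idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the learned, end-to-end modulation mechanism is replaced here by a plain nearest-neighbour label copy, and the `Memory` class, the `retrieve` and `transfer_labels` functions, and the hand-written embeddings are all hypothetical names and values chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    """A labelled scene stored in memory (hypothetical structure)."""
    name: str
    embedding: list    # scene-level feature vector (hand-written here)
    points: list       # list of (x, y, z) tuples
    part_labels: list  # one part label per point


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_embedding, memory, k=1):
    """Return the k memory scenes most similar to the query embedding."""
    ranked = sorted(
        memory,
        key=lambda m: cosine(query_embedding, m.embedding),
        reverse=True,
    )
    return ranked[:k]


def transfer_labels(query_points, retrieved):
    """Give each query point the label of its nearest retrieved memory point.

    Assumption: this nearest-neighbour copy stands in for the paper's
    learned modulation mechanism, which is not reproduced here.
    """
    labels = []
    for q in query_points:
        best_label, best_dist = None, math.inf
        for mem in retrieved:
            for p, label in zip(mem.points, mem.part_labels):
                d = math.dist(q, p)
                if d < best_dist:
                    best_label, best_dist = label, d
        labels.append(best_label)
    return labels


# Memory starts with a single labelled "chair" scene.
memory = [
    Memory("chair", [1.0, 0.0], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)], ["leg", "seat"]),
]
query_points = [(0.1, 0.0, 0.1), (0.0, 0.1, 0.9)]
chair_parse = transfer_labels(query_points, retrieve([0.9, 0.1], memory))

# Adding a new category only appends to memory; nothing is retrained.
memory.append(
    Memory("lamp", [0.0, 1.0], [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)], ["base", "shade"])
)
lamp_parse = transfer_labels(query_points, retrieve([0.1, 0.9], memory))
```

In this sketch, passing `k=2` to `retrieve` lets a single query draw labels from two memory scenes at once, a crude analogue of the mix-and-match composition described above.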

