CVJun 19, 2023

Primitive Generation and Semantic-related Alignment for Universal Zero-Shot Segmentation

arXiv:2306.11087v149 citationsh-index: 74Has Code
Originality Highly original
AI Analysis

It addresses the problem of segmenting unseen categories in computer vision, which is incremental as it builds on existing zero-shot segmentation methods with novel generative and alignment techniques.

The paper tackles universal zero-shot segmentation for novel categories without training samples by bridging semantic and visual spaces, achieving state-of-the-art performance on panoptic, instance, and semantic segmentation tasks.

We study universal zero-shot segmentation in this work to achieve panoptic, instance, and semantic segmentation for novel categories without any training samples. Such zero-shot segmentation ability relies on inter-class relationships in semantic space to transfer the visual knowledge learned from seen categories to unseen ones. Thus, it is desired to well bridge semantic-visual spaces and apply the semantic relationships to visual feature learning. We introduce a generative model to synthesize features for unseen categories, which links semantic and visual spaces as well as addresses the issue of lack of unseen training data. Furthermore, to mitigate the domain gap between semantic and visual spaces, firstly, we enhance the vanilla generator with learned primitives, each of which contains fine-grained attributes related to categories, and synthesize unseen features by selectively assembling these primitives. Secondly, we propose to disentangle the visual feature into the semantic-related part and the semantic-unrelated part that contains useful visual classification clues but is less relevant to semantic representation. The inter-class relationships of semantic-related visual features are then required to be aligned with those in semantic space, thereby transferring semantic knowledge to visual feature learning. The proposed approach achieves impressively state-of-the-art performance on zero-shot panoptic segmentation, instance segmentation, and semantic segmentation. Code is available at https://henghuiding.github.io/PADing/.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes