CV · Mar 1, 2023

Distilled Reverse Attention Network for Open-world Compositional Zero-Shot Learning

arXiv:2303.00404v1 · 23 citations · h-index: 72
Originality: Incremental advance
AI Analysis

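This paper addresses the challenge of recognizing new compositions of attributes and objects in open-world scenarios. The advance is incremental: it builds on prior work that predicts attributes and objects separately, but models the two with different motivations, capturing contextuality for attributes and locality for objects.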

The paper tackles Open-World Compositional Zero-Shot Learning (OW-CZSL), where methods built for the closed-world setting degrade because the test space is unconstrained. It proposes a Distilled Reverse Attention Network that learns disentangled representations of attributes and objects, and it reports state-of-the-art performance on three datasets.

Open-World Compositional Zero-Shot Learning (OW-CZSL) aims to recognize new compositions of seen attributes and objects. In OW-CZSL, methods built on the conventional closed-world setting degrade severely due to the unconstrained OW test space. While previous works alleviate the issue by pruning compositions according to external knowledge or correlations in seen pairs, they introduce biases that harm the generalization. Some methods thus predict state and object with independently constructed and trained classifiers, ignoring that attributes are highly context-dependent and visually entangled with objects. In this paper, we propose a novel Distilled Reverse Attention Network to address the challenges. We also model attributes and objects separately but with different motivations, capturing contextuality and locality, respectively. We further design a reverse-and-distill strategy that learns disentangled representations of elementary components in training data supervised by reverse attention and knowledge distillation. We conduct experiments on three datasets and consistently achieve state-of-the-art (SOTA) performance.
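
To make the "unconstrained OW test space" concrete, the following is a minimal sketch of how the candidate set changes between the closed-world and open-world settings. It is not the paper's method: the attribute and object names, the probabilities, and the helper functions `score_compositions` and `predict` are all hypothetical, and the product-of-probabilities scoring is just one simple way to combine independently predicted attribute and object scores.

```python
from itertools import product

# Hypothetical vocabulary of primitives (not taken from any dataset).
attributes = ["wet", "dry", "sliced"]
objects = ["apple", "dog", "road"]

# Pairs observed during training, plus the unseen pairs that a
# closed-world benchmark would list in advance (both sets are made up).
seen_pairs = {("wet", "dog"), ("dry", "road"), ("sliced", "apple")}
closed_world_unseen = {("wet", "road"), ("dry", "dog")}

# Closed world: only seen pairs and the pre-specified unseen pairs.
closed_world_space = seen_pairs | closed_world_unseen
# Open world: every attribute-object combination is a candidate.
open_world_space = set(product(attributes, objects))


def score_compositions(attr_probs, obj_probs, space):
    """Score each candidate pair as P(attribute) * P(object).

    Assumption: attribute and object predictions are treated as
    independent, which is a simplification for illustration only.
    """
    return {(a, o): attr_probs[a] * obj_probs[o] for (a, o) in space}


def predict(attr_probs, obj_probs, space):
    """Return the highest-scoring pair within the given candidate space."""
    scores = score_compositions(attr_probs, obj_probs, space)
    return max(scores, key=scores.get)


# Hypothetical classifier outputs for a single image.
attr_probs = {"wet": 0.5, "dry": 0.3, "sliced": 0.2}
obj_probs = {"apple": 0.6, "dog": 0.3, "road": 0.1}

closed_prediction = predict(attr_probs, obj_probs, closed_world_space)
open_prediction = predict(attr_probs, obj_probs, open_world_space)

print(len(closed_world_space), len(open_world_space))
print(closed_prediction, open_prediction)
```

In this toy setup the open-world space has 9 candidates against 5 in the closed world, and the open-world prediction is a pair that the closed-world space never considers. With realistic vocabularies the full product grows much faster than the set of feasible pairs, which is the source of the degradation the abstract describes.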
