CV · Nov 17, 2022

Targeted Attention for Generalized- and Zero-Shot Learning

arXiv:2211.09322v1 · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning concepts without labeled data, which is relevant to unsupervised concept learning and domain adaptation. It is an incremental advance because it builds on existing methods from the related person re-identification task.

The paper tackles Zero-Shot Learning (ZSL) by combining approaches from person re-identification, modified so that no feature or training dataset augmentation is needed. It reports state-of-the-art performance, with NMI of 63.27 and top-1 of 61.04 on CUB200, and NMI of 66.03 with top-1 of 82.75% on Cars196.

The Zero-Shot Learning (ZSL) task attempts to learn concepts without any labeled data. Unlike traditional classification/detection tasks, the evaluation environment contains unseen classes never encountered during training. As such, it remains both challenging and promising on a variety of fronts, including unsupervised concept learning, domain adaptation, and dataset drift detection. Recently, there have been a variety of approaches to solving ZSL, including improved metric learning methods, transfer learning, combinations of semantic and image domains using, e.g., word vectors, and generative models that model the latent space of known classes in order to classify unseen classes. We find many approaches require intensive training augmentation with attributes or features that may be commonly unavailable (attribute-based learning) or susceptible to adversarial attacks (generative learning). We propose combining approaches from the related person re-identification task for ZSL, with key modifications to ensure sufficiently improved performance in the ZSL setting without the need for feature or training dataset augmentation. We achieve state-of-the-art performance on the CUB200 and Cars196 datasets in the ZSL setting compared to recent works, with NMI (normalized mutual information) of 63.27 and top-1 of 61.04 for CUB200, and NMI of 66.03 with top-1 of 82.75% for Cars196. We also show state-of-the-art results in the Generalized Zero-Shot Learning (GZSL) setting, with a Harmonic Mean R-1 of 66.14% on the CUB200 dataset.
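The three metrics named in the abstract (NMI, top-1 retrieval accuracy, and the GZSL harmonic mean) can be illustrated with a minimal sketch. This is not the paper's evaluation code. The function names, the Euclidean distance, the arithmetic-mean normalization of NMI, and the toy inputs are all assumptions made for illustration, and the abstract does not state which conventions the authors used.

```python
import math
from collections import Counter


def harmonic_mean(seen_score, unseen_score):
    """Harmonic mean of a seen-class and an unseen-class score (GZSL convention)."""
    total = seen_score + unseen_score
    if total == 0:
        return 0.0
    return 2 * seen_score * unseen_score / total


def recall_at_1(embeddings, labels):
    """Fraction of items whose nearest other item (Euclidean) has the same label."""
    hits = 0
    for i, query in enumerate(embeddings):
        best_j, best_dist = None, math.inf
        for j, other in enumerate(embeddings):
            if i == j:
                continue  # a query never retrieves itself
            dist = math.dist(query, other)
            if dist < best_dist:
                best_j, best_dist = j, dist
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)


def entropy(counts, n):
    """Shannon entropy (in nats) of a label distribution given as counts."""
    return -sum(c / n * math.log(c / n) for c in counts.values())


def nmi(labels_true, labels_pred):
    """Normalized mutual information between two labelings.

    Assumption: normalized by the arithmetic mean of the two entropies.
    """
    n = len(labels_true)
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)
    joint_counts = Counter(zip(labels_true, labels_pred))
    mi = sum(
        c / n * math.log(c * n / (true_counts[t] * pred_counts[p]))
        for (t, p), c in joint_counts.items()
    )
    denom = (entropy(true_counts, n) + entropy(pred_counts, n)) / 2
    return mi / denom if denom > 0 else 1.0


# Toy data (hypothetical, not from the paper): two well-separated classes.
toy_embeddings = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = [0, 0, 1, 1]
toy_clusters = [1, 1, 0, 0]  # same partition, different cluster ids

r1 = recall_at_1(toy_embeddings, toy_labels)
score_nmi = nmi(toy_labels, toy_clusters)
h = harmonic_mean(0.8, 0.4)  # hypothetical seen/unseen scores
```

NMI is invariant to how cluster ids are named, which is why it suits unseen classes: the toy clustering above swaps the ids but still recovers the true partition exactly. The harmonic mean penalizes imbalance between seen and unseen performance, so a model that does well only on seen classes scores low.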

