CV, Mar 31, 2018

Tagging like Humans: Diverse and Distinct Image Annotation

arXiv:1804.00113v1
48 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating human-like annotations for images. The advance is incremental: it builds on existing methods such as determinantal point processes (DPPs) and GANs to improve tag diversity.

The paper tackles automatic image annotation by proposing a generative model, D2IA, that produces semantically relevant, distinct, and diverse tags, outperforming state-of-the-art methods in diversity and distinctiveness as shown in experiments on two benchmark datasets.

In this work we propose a new automatic image annotation model, dubbed diverse and distinct image annotation (D2IA). The generative model D2IA is inspired by the ensemble of human annotations, which creates semantically relevant, yet distinct and diverse tags. In D2IA, we generate a relevant and distinct tag subset, in which the tags are relevant to the image contents and semantically distinct from each other, using sequential sampling from a determinantal point process (DPP) model. Multiple such tag subsets, covering diverse semantic aspects or diverse semantic levels of the image contents, are generated by randomly perturbing the DPP sampling process. We leverage a generative adversarial network (GAN) model to train D2IA. Extensive experiments on two benchmark datasets, including quantitative and qualitative comparisons as well as human subject studies, demonstrate that the proposed model can produce more diverse and distinct tags than state-of-the-art methods.
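The sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the kernel form L = diag(q) S diag(q), the multiplicative noise on the quality scores, the stepwise sampling rule (a simple approximation, not exact k-DPP sampling), and all names (`build_kernel`, `sequential_dpp_sample`, `diverse_tag_subsets`, `noise_scale`) are assumptions made for illustration. The quality scores and similarity matrix are made-up toy values, not data from the paper.

```python
import numpy as np

def build_kernel(quality, similarity):
    """Assumed DPP kernel L = diag(q) S diag(q): q scores tag relevance,
    S holds pairwise tag similarity (must be positive semidefinite)."""
    q = np.asarray(quality, dtype=float)
    return q[:, None] * np.asarray(similarity, dtype=float) * q[None, :]

def sequential_dpp_sample(L, k, rng):
    """Pick up to k items one at a time. At each step, item i is drawn with
    probability proportional to det(L[S + i]) / det(L[S]), so items similar
    to those already selected become unlikely."""
    n = L.shape[0]
    selected = []
    for _ in range(k):
        base = np.linalg.det(L[np.ix_(selected, selected)]) if selected else 1.0
        gains = np.zeros(n)
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            gains[i] = max(np.linalg.det(L[np.ix_(idx, idx)]) / base, 0.0)
        if gains.sum() <= 0:
            break  # no remaining item adds volume to the selection
        selected.append(int(rng.choice(n, p=gains / gains.sum())))
    return selected

def diverse_tag_subsets(quality, similarity, k, num_subsets, noise_scale=0.1, seed=0):
    """Draw several subsets, perturbing the quality scores with random noise
    before each draw (an assumed stand-in for the paper's perturbation)."""
    rng = np.random.default_rng(seed)
    q = np.asarray(quality, dtype=float)
    subsets = []
    for _ in range(num_subsets):
        noisy_q = q * np.exp(noise_scale * rng.standard_normal(q.shape))
        subsets.append(sequential_dpp_sample(build_kernel(noisy_q, similarity), k, rng))
    return subsets

# Toy example: "dog" and "puppy" are near-duplicates, so they rarely co-occur.
tags = ["dog", "puppy", "grass", "park"]
quality = [0.9, 0.8, 0.6, 0.5]
similarity = np.array([
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])
for subset in diverse_tag_subsets(quality, similarity, k=2, num_subsets=3):
    print([tags[i] for i in subset])
```

In this sketch, the determinant ratio plays the role of the "distinct" constraint within one subset, while the per-draw noise is what makes different subsets cover different tags.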
