CV · May 15, 2015

Discovering Attribute Shades of Meaning with the Crowd

arXiv:1505.04117v1 · 40 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem that attribute words in computer vision often lack a precise visual definition. It offers personalized attribute models for tasks such as image search and zero-shot learning, though the contribution is incremental, refining existing methods rather than replacing them.

The paper tackles the problem of semantic attribute learning by discovering latent factors in crowdsourced image labels to identify different interpretations of an attribute, leading to improved attribute prediction accuracy and more successful attribute-based image search.

To learn semantic attributes, existing methods typically train one discriminative model for each word in a vocabulary of nameable properties. However, this "one model per word" assumption is problematic: while a word might have a precise linguistic definition, it need not have a precise visual definition. We propose to discover shades of attribute meaning. Given an attribute name, we use crowdsourced image labels to discover the latent factors underlying how different annotators perceive the named concept. We show that structure in those latent factors helps reveal shades, that is, interpretations for the attribute shared by some group of annotators. Using these shades, we train classifiers to capture the primary (often subtle) variants of the attribute. The resulting models are both semantic and visually precise. By catering to users' interpretations, they improve attribute prediction accuracy on novel images. Shades also enable more successful attribute-based image search, by providing robust personalized models for retrieving multi-attribute query results. They are widely applicable to tasks that involve describing visual content, such as zero-shot category learning and organization of photo collections.
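The pipeline described above (crowdsourced labels, then latent factors, then groups of annotators who share an interpretation) can be illustrated with a small sketch. The abstract does not say which factorization or clustering method is used, so this example swaps in assumptions: a truncated SVD of a mean-imputed annotator-by-image label matrix, followed by a basic k-means with farthest-point initialization. The function names `annotator_factors` and `cluster_shades` and the toy data are hypothetical, not taken from the paper.

```python
import numpy as np

def annotator_factors(labels, n_factors=2):
    """Embed annotators in a low-dimensional latent space.

    labels: (annotators x images) array with 1 = "attribute present",
    0 = "absent", np.nan = image not labeled by that annotator.
    Assumption: missing labels are filled with the per-image mean.
    """
    labels = np.asarray(labels, dtype=float)
    col_mean = np.nanmean(labels, axis=0)
    filled = np.where(np.isnan(labels), col_mean, labels)
    centered = filled - col_mean
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_factors] * s[:n_factors]

def cluster_shades(factors, n_shades=2, n_iter=20):
    """Group annotators with similar latent factors (plain k-means)."""
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [factors[0]]
    for _ in range(1, n_shades):
        d = np.min([np.sum((factors - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(factors[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = ((factors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(n_shades):
            if np.any(assign == k):
                centers[k] = factors[assign == k].mean(axis=0)
    return assign

# Toy data: annotators 0-2 call image 2 positive for the attribute,
# annotators 3-5 instead call images 4 and 5 positive.
toy_labels = [
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, np.nan, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
    [1, 1, 0, 0, 1, 1],
]
shade_of_annotator = cluster_shades(annotator_factors(toy_labels), n_shades=2)
print(shade_of_annotator)
```

Each resulting cluster stands in for one shade. Under the same assumptions, a per-shade classifier could then be trained using only the labels contributed by annotators in that cluster.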

