CV · Aug 22, 2019

Learning Similarity Conditions Without Explicit Supervision

arXiv:1908.08589v1 · 90 citations
AI Analysis

This work removes the need for explicit similarity-condition labels in tasks such as fashion recommendation, which allows better generalization to new categories.

The paper tackles the problem of learning multiple similarity conditions (e.g., color, category, shape) without explicit supervision, enabling models to generalize to unseen categories. The proposed approach outperforms state-of-the-art supervised methods on tasks like fill-in-the-blank and outfit compatibility prediction across three datasets.

Many real-world tasks require models to compare images along multiple similarity conditions (e.g. similarity in color, category or shape). Existing methods often reason about these complex similarity relationships by learning condition-aware embeddings. While such embeddings aid models in learning different notions of similarity, they also limit their capability to generalize to unseen categories since they require explicit labels at test time. To address this deficiency, we propose an approach that jointly learns representations for the different similarity conditions and their contributions as a latent variable without explicit supervision. Comprehensive experiments across three datasets, Polyvore-Outfits, Maryland-Polyvore and UT-Zappos50k, demonstrate the effectiveness of our approach: our model outperforms the state-of-the-art methods, even those that are strongly supervised with pre-defined similarity conditions, on fill-in-the-blank, outfit compatibility prediction and triplet prediction tasks. Finally, we show that our model learns different visually-relevant semantic sub-spaces that allow it to generalize well to unseen categories.
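The idea of learning condition-specific representations together with their contributions can be illustrated with a minimal sketch. It is not the authors' implementation: the element-wise masks, the softmax gating matrix `W_gate`, the margin value, and all function names are assumptions chosen for illustration. Each image feature is projected into `M` latent condition sub-spaces, a weight vector predicted from the image pair mixes those sub-spaces, and a triplet loss is the only supervision, so no condition label is needed at training or test time.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def condition_weights(feat_a, feat_b, W_gate):
    # Predict one weight per latent condition from the image pair itself
    # (hypothetical gating layer; no condition label is supplied).
    pair = np.concatenate([feat_a, feat_b], axis=-1)  # shape (2d,)
    return softmax(pair @ W_gate)                     # shape (M,)

def combined_embedding(feat, masks, weights):
    # Project the feature into M sub-spaces with element-wise masks,
    # then mix the sub-space embeddings using the predicted weights.
    sub = masks * feat      # shape (M, d)
    return weights @ sub    # shape (d,)

def triplet_loss(anchor, pos, neg, margin=0.3):
    # Hinge loss: the positive should be closer to the anchor than the
    # negative by at least `margin` (margin value is an assumption).
    d_pos = np.linalg.norm(anchor - pos)
    d_neg = np.linalg.norm(anchor - neg)
    return max(0.0, d_pos - d_neg + margin)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, M = 8, 3                                   # feature size, latent conditions
    anchor, pos, neg = rng.normal(size=(3, d))    # stand-ins for CNN features
    W_gate = rng.normal(size=(2 * d, M))          # would be learned in practice
    masks = rng.random(size=(M, d))               # would be learned in practice

    w_pos = condition_weights(anchor, pos, W_gate)
    w_neg = condition_weights(anchor, neg, W_gate)
    loss = triplet_loss(
        combined_embedding(anchor, masks, w_pos),
        combined_embedding(pos, masks, w_pos),
        combined_embedding(neg, masks, w_neg),
    )
    print("condition weights (anchor, pos):", w_pos)
    print("triplet loss:", loss)
```

In a trained model the masks and the gating matrix would be optimized jointly through the triplet loss. The sketch only shows how the condition weights can act as a latent variable inferred from the inputs rather than supplied as a label.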

Code Implementations: 1 repo
