CV · Sep 27, 2019

A weakly supervised adaptive triplet loss for deep metric learning

arXiv:1909.12939v1 · 26 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of learning effective image embeddings for cross-domain visual similarity search with reduced annotation effort. The advance is incremental, since it builds on existing triplet loss methods.

The paper tackles distance metric learning for visual similarity search by proposing a weakly supervised adaptive triplet loss (ATL) that captures fine-grained semantic similarity and improves generalization on cross-domain data. ATL improves on the triplet loss baseline by 10.6% on cross-domain data and outperforms the state-of-the-art model on the Amazon fashion retrieval and DeepFashion in-shop retrieval benchmarks.

We address the problem of distance metric learning in visual similarity search, defined as learning an image embedding model which projects images into Euclidean space where semantically and visually similar images are closer and dissimilar images are further from one another. We present a weakly supervised adaptive triplet loss (ATL) capable of capturing fine-grained semantic similarity that encourages the learned image embedding models to generalize well on cross-domain data. The method uses weakly labeled product description data to implicitly determine fine-grained semantic classes, avoiding the need to annotate large amounts of training data. We evaluate on the Amazon fashion retrieval benchmark and DeepFashion in-shop retrieval data. The method boosts the performance of the triplet loss baseline by 10.6% on cross-domain data and outperforms the state-of-the-art model on all evaluation metrics.
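As a rough illustration of the idea, the sketch below shows a standard triplet loss on Euclidean embeddings, plus one possible way a margin could adapt to weak text supervision. The abstract does not give the ATL formula, so `adaptive_margin`, its `base_margin` and `scale` parameters, and the `text_similarity` score (imagined here as a value in [0, 1] comparing the product descriptions of the anchor and the negative) are all assumptions made for this example, not the authors' formulation.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: the positive should be closer to the anchor
    than the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def adaptive_margin(text_similarity, base_margin=0.2, scale=0.3):
    """Hypothetical margin rule (an assumption, not taken from the paper):
    the less similar the two product descriptions are, the larger the
    separation demanded between the anchor and the negative."""
    return base_margin + scale * (1.0 - text_similarity)


def adaptive_triplet_loss(anchor, positive, negative, text_similarity):
    """Triplet loss whose margin depends on a weak text-similarity signal."""
    return triplet_loss(anchor, positive, negative, adaptive_margin(text_similarity))


if __name__ == "__main__":
    a, p, n = [0.0, 0.0], [0.0, 1.0], [0.0, 1.1]
    print(triplet_loss(a, p, n))
    print(adaptive_triplet_loss(a, p, n, text_similarity=0.0))
```

Under this assumed rule, a negative whose description is very different from the anchor's is pushed further away than one whose description is nearly identical, which is one way a loss could reflect fine-grained semantic similarity without explicit class labels.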

