CV · Jan 29

Variance & Greediness: A comparative study of metric-learning losses

Donghuo Zeng, Hao Niu, Zhi Li, Masato Taya

arXiv:2601.21450v1 · 1.5 · h-index: 1

Originality: Synthesis-oriented

AI Analysis

This work provides practical guidance for selecting metric-learning losses in image retrieval tasks, though it is incremental as it compares existing methods rather than introducing new ones.

The authors analyzed how different metric-learning losses affect embedding geometry and optimization dynamics, finding that Triplet and SCL losses preserve higher within-class variance and clearer inter-class margins, leading to stronger top-1 retrieval in fine-grained settings, while Contrastive and InfoNCE losses achieve faster embedding compaction but may oversimplify class structures.

Metric learning is central to retrieval, yet its effects on embedding geometry and optimization dynamics are not well understood. We introduce a diagnostic framework, VARIANCE (intra-/inter-class variance) and GREEDINESS (active ratio and gradient norms), to compare seven representative losses, namely Contrastive, Triplet, N-pair, InfoNCE, ArcFace, SCL, and CCL, across five image-retrieval datasets. Our analysis reveals that Triplet and SCL preserve higher within-class variance and clearer inter-class margins, leading to stronger top-1 retrieval in fine-grained settings. In contrast, Contrastive and InfoNCE compact embeddings quickly through many small updates, accelerating convergence but potentially oversimplifying class structures. N-pair achieves a large mean separation but with uneven spacing. These insights reveal an efficiency-granularity trade-off and provide practical guidance: prefer Triplet/SCL when diversity preservation and hard-sample discrimination are critical, and Contrastive/InfoNCE when faster embedding compaction is desired.
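The two diagnostics named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption made for illustration: intra-class variance is taken as the mean squared distance of samples to their class centroid, inter-class variance as the mean squared distance of class centroids to their overall mean, and the active ratio as the fraction of triplets with non-zero hinge loss. The function names `variance_stats` and `triplet_active_ratio` and the margin value are hypothetical.

```python
import numpy as np


def variance_stats(emb, labels):
    """Return (intra, inter) variance for embeddings of shape (n, d).

    Assumed definitions, not taken from the paper.
    """
    emb = np.asarray(emb, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique class ids
    centroids = np.stack([emb[labels == c].mean(axis=0) for c in classes])
    # Map every sample to the row of its own class centroid.
    idx = np.searchsorted(classes, labels)
    # Intra-class: mean squared distance of samples to their class centroid.
    intra = np.mean(np.sum((emb - centroids[idx]) ** 2, axis=1))
    # Inter-class: mean squared distance of centroids to the mean centroid.
    inter = np.mean(np.sum((centroids - centroids.mean(axis=0)) ** 2, axis=1))
    return float(intra), float(inter)


def triplet_active_ratio(anchor, positive, negative, margin=0.2):
    """Fraction of triplets with non-zero hinge loss max(0, d_ap - d_an + margin).

    The margin default is an arbitrary illustrative value.
    """
    anchor = np.asarray(anchor, dtype=float)
    positive = np.asarray(positive, dtype=float)
    negative = np.asarray(negative, dtype=float)
    d_ap = np.linalg.norm(anchor - positive, axis=1)
    d_an = np.linalg.norm(anchor - negative, axis=1)
    return float(np.mean(d_ap - d_an + margin > 0))
```

Under these assumed definitions, a loss that compacts embeddings quickly would show intra-class variance falling early in training, while a low active ratio would indicate that only a few hard triplets still contribute gradient.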
