Improved Embeddings with Easy Positive Triplet Mining
The paper addresses the challenge of improving generalization in deep metric learning for image retrieval. The contribution appears incremental, since it modifies existing mining strategies rather than introducing a new paradigm.
The paper tackles the problem of learning image embeddings for retrieval by proposing a loosened strategy, called Easy Positive mining, that only requires each training image to be mapped to its most similar examples from the same class. The authors report recall that exceeds state-of-the-art methods on multiple datasets, including CUB and Stanford Online Products.
Deep metric learning seeks to define an embedding where semantically similar images are embedded to nearby locations, and semantically dissimilar images are embedded to distant locations. Substantial work has focused on loss functions and strategies to learn these embeddings by pushing images from the same class as close together in the embedding space as possible. In this paper, we propose an alternative, loosened embedding strategy that requires the embedding function to map each training image only to the most similar examples from the same class, an approach we call "Easy Positive" mining. We provide a collection of experiments and visualizations that highlight that this Easy Positive mining leads to embeddings that are more flexible and generalize better to new unseen data. This simple mining strategy yields recall performance that exceeds state-of-the-art approaches (including those with complicated loss functions and ensemble methods) on image retrieval datasets including CUB, Stanford Online Products, In-Shop Clothes and Hotels-50K.
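The selection step described in the abstract, picking for each training image the most similar example of the same class, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes in-batch mining, cosine similarity on L2-normalized, nonzero embeddings, and a hypothetical helper name `easy_positive_indices`. The text above does not specify the loss that consumes the mined pairs, so the sketch leaves it out.

```python
import numpy as np


def easy_positive_indices(embeddings, labels):
    """For each anchor in a batch, return the index of its most similar
    same-class example (its "easy positive"), or -1 if the anchor's class
    has no other member in the batch.

    Assumptions of this sketch (not taken from the source text):
    similarity is cosine similarity, and mining is done within one batch.
    """
    emb = np.asarray(embeddings, dtype=float)
    labels = np.asarray(labels)

    # L2-normalize so that the dot product equals cosine similarity.
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    sim = emb @ emb.T

    # Boolean mask of same-class pairs, excluding each anchor itself.
    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)

    # Ignore other-class pairs, then take the most similar remaining example.
    masked = np.where(same, sim, -np.inf)
    idx = masked.argmax(axis=1)

    # Anchors with no same-class partner get -1.
    idx[~same.any(axis=1)] = -1
    return idx


# Toy batch: class 0 has three members, class 1 has two, class 2 has one.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0],
                  [-1.0, 0.0], [0.1, 0.9], [0.5, 0.5]]
toy_labels = [0, 0, 0, 1, 1, 2]
toy_result = easy_positive_indices(toy_embeddings, toy_labels)
```

In the toy batch, the third image (index 2) lies far from the first but is paired with the second, its closest class-0 neighbor. This is the loosening the abstract describes: each image is matched to its nearest same-class example instead of being pulled toward every member of its class.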