CV · AI · Dec 6, 2021

Producing augmentation-invariant embeddings from real-life imagery

arXiv:2112.03415v2 · 8 citations
AI Analysis

This work addresses image similarity challenges for social media applications, but it is incremental as it builds on existing CNN and ArcFace methods.

The paper tackles the problem of generating augmentation-invariant embeddings from real-life social media images. Its approach placed second in the 2021 Facebook AI Image Similarity Challenge: Descriptor Track.

This article presents an efficient way to produce feature-rich, high-dimensional embedding spaces from real-life images. The features are designed to be independent of the augmentations that images undergo in real-life use on social media. Our approach uses convolutional neural networks (CNNs) to produce an embedding space. The model is trained with an ArcFace head on automatically produced augmentations. Additionally, we present a way to build an ensemble from different embeddings containing the same semantic information, a way to normalize the resulting embedding using an external dataset, and a novel way to quickly train these models with a large number of classes in the ArcFace head. Using this approach, we achieved 2nd place in the 2021 Facebook AI Image Similarity Challenge: Descriptor Track.
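To make the ArcFace head mentioned in the abstract more concrete, here is a minimal NumPy sketch of the standard ArcFace logit computation: an angular margin is added to the angle between each embedding and its own class centre before scaling. It is not the authors' implementation. The function names, the `scale` and `margin` values, and the toy shapes are assumptions for illustration, and the paper's fast-training scheme for a large number of classes is not shown.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along the given axis."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def arcface_logits(embeddings, class_weights, labels, scale=30.0, margin=0.5):
    """Return ArcFace logits of shape (batch, num_classes).

    embeddings:    (batch, dim) raw embeddings from the backbone
    class_weights: (num_classes, dim) one learnable centre per class
    labels:        (batch,) integer class index of each sample
    scale, margin: hypothetical hyperparameter values, not taken from the paper
    """
    emb = l2_normalize(embeddings)
    w = l2_normalize(class_weights)
    # Cosine similarity between every embedding and every class centre.
    cos = np.clip(emb @ w.T, -1.0, 1.0)
    theta = np.arccos(cos)
    # Add the angular margin only at each sample's own class.
    rows = np.arange(len(labels))
    theta_m = theta.copy()
    theta_m[rows, labels] += margin
    return scale * np.cos(theta_m)

# Toy usage: 4 samples, 8-dimensional embeddings, 5 classes.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 8))
class_weights = rng.normal(size=(5, 8))
labels = np.array([0, 1, 2, 3])
logits = arcface_logits(embeddings, class_weights, labels)
plain = arcface_logits(embeddings, class_weights, labels, margin=0.0)
```

These logits would normally be passed to a softmax cross-entropy loss. With `margin=0.0` the function reduces to scaled cosine similarity, so comparing `logits` with `plain` shows that only the target-class entries change.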

Code Implementations: 1 repo