2nd Place Solution to Facebook AI Image Similarity Challenge Matching Track
This work addresses image similarity matching for AI applications, but it is incremental as it builds on existing self-supervised and ViT methods.
The paper tackles image similarity matching with a self-supervised learning approach based on Vision Transformers (ViT): the query and reference images are concatenated into one image, and the model predicts whether they match. The approach achieved 0.8291 Micro-average Precision on the private leaderboard.
This paper presents the 2nd place solution to the Facebook AI Image Similarity Challenge: Matching Track on DrivenData. The solution is based on self-supervised learning and the Vision Transformer (ViT). The main breakthrough comes from concatenating the query and reference images into a single image and asking the ViT to predict directly from that image whether the query image uses the reference image. The solution scored 0.8291 Micro-average Precision on the private leaderboard.
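The concatenation step can be illustrated with a minimal sketch. It is not the authors' code: the function name `concat_pair`, the 224x224 input size, the random stand-in images, and the side-by-side (width-wise) layout are all assumptions made for illustration, since the abstract does not specify them. The ViT classifier that would consume the combined image is not shown.

```python
import numpy as np


def concat_pair(query: np.ndarray, reference: np.ndarray) -> np.ndarray:
    """Join two H x W x C images into one H x 2W x C image.

    Side-by-side layout is an assumption; the source only says the two
    images are concatenated into one.
    """
    if query.shape != reference.shape:
        raise ValueError("resize both images to the same shape first")
    # axis=1 is the width axis for H x W x C arrays.
    return np.concatenate([query, reference], axis=1)


# Hypothetical inputs: random arrays stand in for real, resized RGB images.
rng = np.random.default_rng(0)
query = rng.random((224, 224, 3), dtype=np.float32)
reference = rng.random((224, 224, 3), dtype=np.float32)

pair = concat_pair(query, reference)
print(pair.shape)  # (224, 448, 3)
```

In a full pipeline, `pair` would be passed to a ViT with a binary output that scores whether the query uses the reference, so the model sees both images jointly rather than comparing two separately computed embeddings.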