CV · IR · LG · Dec 29, 2020

Image-to-Image Retrieval by Learning Similarity between Scene Graphs

arXiv:2012.14700v1 · 58 citations
AI Analysis

This work tackles image-to-image retrieval for users seeking semantically similar images, offering an incremental improvement over existing methods.

This paper addresses image-to-image retrieval by proposing a method that learns similarity between scene graphs using graph neural networks. The method is trained to predict image relevance based on human-annotated captions and a pre-trained sentence similarity model, demonstrating better agreement with human perception of image similarity compared to competitive baselines.

As a scene graph compactly summarizes the high-level content of an image in a structured and symbolic manner, the similarity between the scene graphs of two images reflects the relevance of their contents. Based on this idea, we propose a novel approach for image-to-image retrieval using scene graph similarity measured by graph neural networks. In our approach, graph neural networks are trained to predict a proxy image relevance measure, computed from human-annotated captions using a pre-trained sentence similarity model. We collect and publish a dataset of image relevance measured by human annotators to evaluate retrieval algorithms. The collected dataset shows that our method agrees better with the human perception of image similarity than other competitive baselines.
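The proxy relevance measure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a stand-in assumption for the pre-trained sentence similarity model, averaging over all caption pairs is a hypothetical aggregation choice, and `regression_loss` only suggests one way a graph network's predicted similarity might be fitted to the proxy target. All names and captions below are invented for illustration.

```python
import math
from collections import Counter
from itertools import product


def embed(caption):
    """Stand-in sentence embedding (assumption): a bag-of-words count vector.
    The paper uses a pre-trained sentence similarity model instead."""
    return Counter(caption.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def proxy_relevance(captions_a, captions_b):
    """Proxy relevance of two images: mean similarity over all caption pairs
    (hypothetical aggregation, chosen here for simplicity)."""
    pairs = list(product(captions_a, captions_b))
    return sum(cosine(embed(a), embed(b)) for a, b in pairs) / len(pairs)


def regression_loss(predicted_similarity, target_relevance):
    """Squared error between a model's scene graph similarity and the proxy
    target (an assumed training objective, not confirmed by the source)."""
    return (predicted_similarity - target_relevance) ** 2


# Toy captions keyed by hypothetical image ids.
captions = {
    "img_dog_park": ["a dog runs in a park", "a brown dog on grass"],
    "img_dog_beach": ["a dog runs on a beach"],
    "img_pizza": ["a pizza on a table"],
}

# Rank the other images by proxy relevance to the query image.
query = "img_dog_park"
ranked = sorted(
    (name for name in captions if name != query),
    key=lambda name: proxy_relevance(captions[query], captions[name]),
    reverse=True,
)
print(ranked)
```

In this sketch the proxy score serves only as a training target. At retrieval time, the approach described above would compare scene graphs directly, so captions are not needed for the query or database images.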

Code Implementations: 1 repo
