MM · CV · Mar 10, 2022

Two-stream Hierarchical Similarity Reasoning for Image-text Matching

arXiv:2203.05349v1 · 10 citations · h-index: 83
Originality: Incremental advance
AI Analysis

This work improves image-text matching for applications like image retrieval and captioning, but it is incremental as it builds on existing reasoning-based approaches.

The paper addresses two limitations of reasoning-based image-text matching: the absence of multi-level hierarchical similarity information and the reliance on single-stream similarity alignment. It proposes a two-stream hierarchical similarity reasoning network and reports state-of-the-art results on the MSCOCO and Flickr30K datasets.

Reasoning-based approaches have demonstrated their powerful ability for the task of image-text matching. In this work, two issues are addressed for image-text matching. First, for reasoning processing, conventional approaches have no ability to find and use multi-level hierarchical similarity information. To solve this problem, a hierarchical similarity reasoning module is proposed to automatically extract context information, which is then co-exploited with local interaction information for efficient reasoning. Second, previous approaches only consider learning single-stream similarity alignment (i.e., image-to-text level or text-to-image level), which is inadequate to fully use similarity information for image-text matching. To address this issue, a two-stream architecture is developed to decompose image-text matching into image-to-text level and text-to-image level similarity computation. These two issues are investigated by a unifying framework that is trained in an end-to-end manner, namely two-stream hierarchical similarity reasoning network. The extensive experiments performed on the two benchmark datasets of MSCOCO and Flickr30K show the superiority of the proposed approach as compared to existing state-of-the-art methods.
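
The two-stream idea in the abstract can be illustrated with a small sketch. This is not the authors' network: the hierarchical similarity reasoning module is omitted, and the attention-based scoring, the function names (`one_stream_similarity`, `two_stream_similarity`), the `temperature` value, and the equal-weight averaging of the two streams are all assumptions made for illustration. It only shows what it means to compute an image-to-text score and a text-to-image score separately and then combine them.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors along `axis` to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def one_stream_similarity(queries, contexts, temperature=9.0):
    """One direction of matching (hypothetical helper).

    Each query vector attends over all context vectors, and the query is
    scored against its attended context by cosine similarity. The scores
    are averaged over queries. `temperature` is an assumed hyperparameter.
    """
    q = l2_normalize(queries)                          # (n_q, d)
    c = l2_normalize(contexts)                         # (n_c, d)
    attn = softmax(temperature * (q @ c.T), axis=1)    # (n_q, n_c)
    attended = l2_normalize(attn @ c)                  # (n_q, d)
    return float(np.mean(np.sum(q * attended, axis=1)))


def two_stream_similarity(regions, words):
    """Combine image-to-text and text-to-image scores.

    Equal-weight averaging is an assumption of this sketch, not a detail
    taken from the paper.
    """
    i2t = one_stream_similarity(regions, words)   # regions query the words
    t2i = one_stream_similarity(words, regions)   # words query the regions
    return 0.5 * (i2t + t2i), i2t, t2i


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    regions = rng.normal(size=(36, 64))   # e.g. 36 region features per image
    words = rng.normal(size=(12, 64))     # e.g. 12 word features per caption
    score, i2t, t2i = two_stream_similarity(regions, words)
    print(f"i2t={i2t:.3f}  t2i={t2i:.3f}  combined={score:.3f}")
```

The two directions generally give different scores because the attention is normalized over a different set in each stream, which is one way to read the abstract's claim that a single stream does not fully use the available similarity information.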

