CLIR · Feb 21, 2018

Matching Article Pairs with Graphical Decomposition and Convolutions

arXiv:1802.07459v21114 citations
AI Analysis

This paper addresses the challenge of matching long documents for tasks like news aggregation, though the contribution is incremental because it builds on existing graph and encoding techniques.

The paper tackles the problem of matching long articles, such as deciding whether two articles describe the same breaking news. It proposes a Concept Interaction Graph that represents an article as a graph of concepts, and it uses a graph convolutional network to aggregate the matching signals. The authors report significant improvements over state-of-the-art methods on two new datasets of about 30K article pairs each.

Identifying the relationship between two articles, e.g., whether two articles published from different sources describe the same breaking news, is critical to many document understanding tasks. Existing approaches for modeling and matching sentence pairs do not perform well in matching longer documents, which embody more complex interactions between the enclosed entities than a sentence does. To model article pairs, we propose the Concept Interaction Graph to represent an article as a graph of concepts. We then match a pair of articles by comparing the sentences that enclose the same concept vertex through a series of encoding techniques, and aggregate the matching signals through a graph convolutional network. To facilitate the evaluation of long article matching, we have created two datasets, each consisting of about 30K pairs of breaking news articles covering diverse topics in the open domain. Extensive evaluations of the proposed methods on the two datasets demonstrate significant improvements over a wide range of state-of-the-art methods for natural language matching.
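The pipeline described in the abstract (group sentences by concept, compute a matching signal per concept vertex, then aggregate over the graph) can be illustrated with a minimal sketch. This is not the authors' implementation: the Jaccard token overlap stands in for the paper's encoding techniques, the weights are untrained, the propagation rule is the standard normalized graph convolution, and every name here (`jaccard`, `normalize_adjacency`, `gcn_layer`, the toy `concepts` data) is hypothetical.

```python
import math

def jaccard(sents_a, sents_b):
    """Token-overlap similarity between two sentence lists.
    Assumption: a simple stand-in for a learned matching feature."""
    ta = {w.lower() for s in sents_a for w in s.split()}
    tb = {w.lower() for s in sents_b for w in s.split()}
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]  # >= 1 because of the self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(norm_adj, features, weights):
    """One graph-convolution step: ReLU(norm_adj @ features @ weights)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]

# Toy concept graph: each vertex holds the sentences from article A and
# article B that mention the concept (hypothetical data).
concepts = {
    "earthquake": (["A strong earthquake hit the coast"],
                   ["The earthquake hit the coast at dawn"]),
    "rescue": (["Rescue teams arrived quickly"],
               ["Rescue teams searched the rubble"]),
    "election": (["The election is next month"],
                 ["Stocks fell sharply"]),
}

# One matching feature per vertex.
features = [[jaccard(a, b)] for a, b in concepts.values()]

# "earthquake" and "rescue" interact; "election" is isolated.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]

norm_adj = normalize_adjacency(adj)
weights = [[1.0]]  # untrained identity weight, for illustration only
hidden = gcn_layer(norm_adj, features, weights)

# Mean-pool the vertex states into a single article-pair score.
score = sum(row[0] for row in hidden) / len(hidden)
print(score)
```

In a trained model, the pooled vertex representation would feed a classifier that predicts whether the two articles match; the mean pooling here only shows where that aggregation step sits.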

Code Implementations: 1 repo