How Can Graph Neural Networks Help Document Retrieval: A Case Study on CORD19 with Concept Map Generation
This work provides incremental insights for researchers in information retrieval by offering guidelines for developing effective GNNs with semantics-oriented inductive biases for textual reasoning tasks.
The study investigated how graph neural networks (GNNs) can improve document retrieval by testing them on the CORD-19 dataset. It found that semantics-oriented graph functions achieved better and more stable performance than complex structure-oriented GNNs such as GINs and GATs when applied to BM25-retrieved candidates.
Graph neural networks (GNNs), a group of powerful tools for representation learning on irregular data, have demonstrated superior performance in various downstream tasks. With unstructured texts represented as concept maps, GNNs can be exploited for tasks like document retrieval. To understand how GNNs can help document retrieval, we conduct an empirical study on CORD-19, a large-scale multi-discipline dataset. Results show that, compared with complex structure-oriented GNNs such as GINs and GATs, our proposed semantics-oriented graph functions achieve better and more stable performance on the BM25-retrieved candidates. The insights from this case study can serve as a guideline for future work on developing effective GNNs with appropriate semantics-oriented inductive biases for textual reasoning tasks such as document retrieval and classification. All code for this case study is available at https://github.com/HennyJie/GNN-DocRetrieval.
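
To make the setting concrete, the following is a minimal sketch of a two-stage pipeline in which BM25 retrieves candidates and a graph-based score then reorders them. It is an illustration under stated assumptions, not the method from the paper or its repository: the function names, the toy documents, the representation of a concept map as a list of (concept, concept) edges, and the `concept_overlap` scoring rule are all hypothetical stand-ins for the semantics-oriented graph functions mentioned above. The BM25 scorer uses the common non-negative IDF variant with assumed default parameters `k1=1.5` and `b=0.75`.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Non-negative IDF variant (the "+ 1" keeps the log above zero).
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def concept_overlap(query_concepts, doc_edges):
    """Hypothetical stand-in for a semantics-oriented graph function:
    the fraction of query concepts that appear as nodes in the concept map."""
    wanted = set(query_concepts)
    if not wanted:
        return 0.0
    nodes = {node for edge in doc_edges for node in edge}
    return len(wanted & nodes) / len(wanted)


def retrieve_and_rerank(query, docs, concept_maps, top_k=2):
    """Stage 1: keep the top_k BM25 candidates. Stage 2: reorder by graph score."""
    scores = bm25_scores(query, docs)
    candidates = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:top_k]
    return sorted(candidates,
                  key=lambda i: concept_overlap(query, concept_maps[i]),
                  reverse=True)


# Toy data (assumed for illustration only).
docs = [
    ["covid", "vaccine", "trial", "results"],
    ["covid", "covid", "transmission", "masks"],
    ["influenza", "vaccine", "history"],
]
concept_maps = [
    [("covid", "vaccine"), ("vaccine", "trial")],
    [("covid", "transmission")],
    [("influenza", "vaccine")],
]
query = ["covid", "vaccine"]

ranked = retrieve_and_rerank(query, docs, concept_maps, top_k=2)
print(ranked)
```

In this sketch the second stage only sees documents that BM25 already surfaced, so the graph-based score can reorder candidates but cannot recover a document that the first stage missed.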