CL AI LG

Apr 18, 2021

On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles

Rakesh Gosangi, Ravneet Arora, Mohsen Gheisarieha, Debanjan Mahata, Haimin Zhang

arXiv:2104.08962v1

31.7728 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses a domain-specific problem for researchers and scholars, predicting which sentences warrant a citation, but it is incremental because it builds on existing sequence labeling methods.

The paper studies how context affects the prediction of citation worthiness for sentences in scholarly articles. It quantifies the benefits of contextual information and contextual embeddings, with evaluation on three benchmark datasets.

In this paper, we study the importance of context in predicting the citation worthiness of sentences in scholarly articles. We formulate this problem as a sequence labeling task solved using a hierarchical BiLSTM model. We contribute a new benchmark dataset containing over two million sentences and their corresponding labels. We preserve the sentence order in this dataset and perform document-level train/test splits, which importantly allows incorporating contextual information in the modeling process. We evaluate the proposed approach on three benchmark datasets. Our results quantify the benefits of using context and contextual embeddings for citation worthiness. Lastly, through error analysis, we provide insights into cases where context plays an essential role in predicting citation worthiness.
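The document-level split mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record layout `(doc_id, position, text, label)`, the function name `document_level_split`, and the toy sentences are all assumptions made for illustration. The point it shows is that whole documents go to either train or test, and that sentences stay in their original order so neighbouring sentences remain available as context.

```python
import random
from collections import defaultdict


def document_level_split(records, test_fraction=0.25, seed=0):
    """Split sentence records by document (hypothetical helper).

    records: iterable of (doc_id, position, text, label) tuples, where
    label is 1 if the sentence is citation-worthy and 0 otherwise.
    Returns two dicts mapping doc_id -> ordered list of (text, label).
    """
    # Group sentences by the document they came from.
    docs = defaultdict(list)
    for doc_id, position, text, label in records:
        docs[doc_id].append((position, text, label))

    # Shuffle document ids, not sentences, so no document is split
    # across train and test.
    doc_ids = sorted(docs)
    random.Random(seed).shuffle(doc_ids)
    n_test = max(1, round(len(doc_ids) * test_fraction))
    test_ids = set(doc_ids[:n_test])

    train, test = {}, {}
    for doc_id, rows in docs.items():
        # Restore the original sentence order within each document.
        ordered = [(text, label) for _, text, label in sorted(rows, key=lambda r: r[0])]
        (test if doc_id in test_ids else train)[doc_id] = ordered
    return train, test


# Toy data (invented for illustration), deliberately out of order.
records = [
    ("doc_a", 1, "Prior work framed this as sentence classification.", 1),
    ("doc_a", 0, "We study citation worthiness.", 0),
    ("doc_b", 0, "BiLSTMs are widely used for sequence labeling.", 1),
    ("doc_b", 1, "We adopt a hierarchical variant.", 0),
    ("doc_c", 0, "Our dataset preserves sentence order.", 0),
    ("doc_d", 0, "Contextual embeddings improve many NLP tasks.", 1),
]

train, test = document_level_split(records)
```

A sentence-level random split would instead scatter sentences from one document across both sets, which removes the surrounding context a sequence labeling model relies on and can leak information between train and test.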
