CL · Jul 2, 2021

DRIFT: A Toolkit for Diachronic Analysis of Scientific Literature

Abheesht Sharma, Gunjan Chhablani, Harshit Pandey, Rajaswa Patil

arXiv:2107.01198v530.7664 citations · h-index: 8 · Has Code

Originality: Synthesis-oriented

AI Analysis

DRIFT gives researchers an easy-to-use tool for tracking research trends, but the contribution is incremental: it collates existing analysis methods and adds a few of the authors' own.

The authors address the problem of analyzing scientific literature over time by developing DRIFT, a toolkit for diachronic analysis. They demonstrate its utility through a case study on the cs.CL corpus of arXiv, showing that methods such as keyword extraction and semantic drift analysis can be used to track research trends.

In this work, we present to the NLP community, and to the wider research community, an application for the diachronic analysis of research corpora. We open-source an easy-to-use tool named DRIFT, which allows researchers to track research trends and developments over the years. The analysis methods are collated from well-cited research works, with a few of our own methods added. Some of the analysis methods are: keyword extraction, word clouds, predicting declining/stagnant/growing trends using Productivity, tracking bi-grams using Acceleration plots, finding the Semantic Drift of words, and tracking trends using similarity. To demonstrate the utility and efficacy of our tool, we perform a case study on the cs.CL corpus of the arXiv repository and draw inferences from the analysis methods. The toolkit and the associated code are available here: https://github.com/rajaswa/DRIFT.
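To make the idea of semantic drift concrete, here is a minimal sketch, not DRIFT's actual implementation. It assumes each word already has one embedding per time period and that the two embedding spaces have been aligned, so the vectors are directly comparable. Drift is measured here as cosine distance, which is one common choice. The names `cosine_distance`, `rank_by_drift`, `emb_2015`, and `emb_2020` are hypothetical, and the vectors are made-up toy values.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("a zero vector has no direction")
    return 1.0 - dot / (norm_u * norm_v)


def rank_by_drift(emb_start, emb_end):
    """Rank words present in both periods by drift, largest first."""
    shared = emb_start.keys() & emb_end.keys()
    drifts = {w: cosine_distance(emb_start[w], emb_end[w]) for w in shared}
    return sorted(drifts.items(), key=lambda kv: kv[1], reverse=True)


# Toy 2-d vectors (assumed values, for illustration only).
emb_2015 = {"attention": [1.0, 0.0], "parser": [0.0, 1.0]}
emb_2020 = {"attention": [0.0, 1.0], "parser": [0.0, 2.0]}

ranking = rank_by_drift(emb_2015, emb_2020)
for word, drift in ranking:
    print(f"{word}: {drift:.2f}")
```

In this toy data, "attention" points in a different direction in the second period and so ranks first, while "parser" only changes in length and shows no drift under cosine distance. In practice the per-period embeddings would be trained on each year's papers and aligned before any comparison.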
