Examining Citations of Natural Language Processing Literature
This work offers insight into the growth and impact of NLP research and helps researchers and institutions quantify paper influence. It is incremental, however, in that it applies existing bibliometric methods to a specific domain.
The study analyzed citation trends in NLP literature using data from the ACL Anthology and Google Scholar, finding that only about 56% of papers are cited ten or more times, with long papers receiving almost three times as many citations as short papers.
We extracted information from the ACL Anthology (AA) and Google Scholar (GS) to examine trends in citations of NLP papers. We explore questions such as: How well cited are papers of different types (journal articles, conference papers, demo papers, etc.)? How well cited are papers from different areas within NLP? Notably, we show that only about 56% of the papers in AA are cited ten or more times. CL Journal has the most cited papers, but its citation dominance has lessened in recent years. On average, long papers get almost three times as many citations as short papers, and papers on sentiment classification, anaphora resolution, and entity recognition have the highest median citations. The analyses presented here, and the associated dataset of NLP papers mapped to citations, have a number of uses, including understanding how the field is growing and quantifying the impact of different types of papers.
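
The statistics reported above (the share of papers cited ten or more times, and mean or median citations per paper type) are simple to compute once papers are mapped to citation counts. The following is a minimal sketch of that arithmetic, not the authors' pipeline. The records, the `(paper_type, citation_count)` layout, and the function names are hypothetical and chosen only for illustration; the numbers are toy values, not AA or GS data.

```python
from statistics import mean, median

# Hypothetical toy records: (paper_type, citation_count).
# These values are invented for illustration and are NOT real AA/GS data.
papers = [
    ("long", 30),
    ("long", 12),
    ("long", 3),
    ("short", 10),
    ("short", 4),
    ("short", 1),
]


def share_cited_at_least(records, threshold=10):
    """Return the fraction of papers with at least `threshold` citations."""
    counts = [count for _, count in records]
    return sum(count >= threshold for count in counts) / len(counts)


def stats_by_type(records):
    """Map each paper type to its (mean, median) citation count."""
    grouped = {}
    for paper_type, count in records:
        grouped.setdefault(paper_type, []).append(count)
    return {t: (mean(v), median(v)) for t, v in grouped.items()}


share = share_cited_at_least(papers)
by_type = stats_by_type(papers)

# Ratio of mean citations, long papers vs. short papers.
long_to_short = by_type["long"][0] / by_type["short"][0]

print(f"Share cited >= 10 times: {share:.0%}")
print(f"Mean/median by type: {by_type}")
print(f"Long-to-short mean citation ratio: {long_to_short:.1f}")
```

Reporting the median alongside the mean is a deliberate choice in this sketch: citation counts are typically skewed by a few highly cited papers, so the two can differ substantially.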