DL | CL | May 3, 2020

Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations

arXiv:2005.00962v21003 citations
AI Analysis

The paper highlights a persistent gender gap affecting female researchers in NLP, with implications for diversity and fairness in the field. Its contribution is incremental, since it builds on existing measurement methods.

The study measured gender disparities in authorship and citations in Natural Language Processing research from 1965 to 2019. It found that only about 29% of first authors and 25% of last authors are female, with no improvement since the mid-2000s, and that female first authors are cited less on average than male first authors, even after controlling for experience and area of research.

Disparities in authorship and citations across gender can have substantial adverse consequences not just on the disadvantaged genders, but also on the field of study as a whole. Measuring gender gaps is a crucial step towards addressing them. In this work, we examine female first author percentages and the citations to their papers in Natural Language Processing (1965 to 2019). We determine aggregate-level statistics using an existing manually curated author-gender list as well as first names strongly associated with a gender. We find that only about 29% of first authors are female and only about 25% of last authors are female. Notably, this percentage has not improved since the mid-2000s. We also show that, on average, female first authors are cited less than male first authors, even when controlling for experience and area of research. Finally, we discuss the ethical considerations involved in automatic demographic analysis.
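
The aggregate statistics described in the abstract can be illustrated with a minimal sketch. This is not the paper's code. The lookup table, the toy paper records, and all function names are hypothetical, invented only for illustration. The sketch also assumes that first authors whose names are not in the lookup are left out of the denominator, and it does not attempt the controls for experience or research area mentioned in the abstract.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for a curated author-gender list combined with
# first names strongly associated with a gender (invented entries).
NAME_GENDER = {"maria": "F", "anna": "F", "john": "M", "david": "M"}

# Invented toy records; these are not data from the paper.
TOY_PAPERS = [
    {"authors": ["Maria Lopez", "John Smith"], "citations": 10},
    {"authors": ["John Smith"], "citations": 30},
    {"authors": ["David Lee", "Anna Kim"], "citations": 20},
    {"authors": ["Anna Kim"], "citations": 20},
    {"authors": ["Taylor Chen"], "citations": 5},  # name not in the lookup
]


def first_author_gender(paper):
    """Look up the first author's first name; return 'unknown' if absent."""
    first_name = paper["authors"][0].split()[0].lower()
    return NAME_GENDER.get(first_name, "unknown")


def female_first_author_pct(papers):
    """Percentage of female first authors among papers with a known label."""
    counts = Counter(first_author_gender(p) for p in papers)
    known = counts["F"] + counts["M"]
    if known == 0:
        return None  # no first author could be labeled
    return 100.0 * counts["F"] / known


def mean_citations_by_gender(papers):
    """Mean citation count per first-author gender, skipping unknowns."""
    by_gender = defaultdict(list)
    for p in papers:
        gender = first_author_gender(p)
        if gender != "unknown":
            by_gender[gender].append(p["citations"])
    return {g: sum(c) / len(c) for g, c in by_gender.items()}


if __name__ == "__main__":
    print(female_first_author_pct(TOY_PAPERS))
    print(mean_citations_by_gender(TOY_PAPERS))
```

Reporting only aggregate figures, and treating unmatched names as unknown rather than guessing, is in keeping with the abstract's emphasis on aggregate-level statistics and on the ethical considerations of automatic demographic analysis.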

