Man is to Person as Woman is to Location: Measuring Gender Bias in Named Entity Recognition
This work addresses gender bias in NER systems used in industry and academia and provides a new benchmark for evaluating it, though the contribution is incremental.
The study measures gender bias in named entity recognition (NER) models and finds that female names are recognized as PERSON entities less often than male names, based on an evaluation using 139 years of U.S. census baby names.
We study bias in several state-of-the-art named entity recognition (NER) models, specifically a difference in their ability to recognize male and female names as PERSON entities. We evaluate NER models on a dataset containing 139 years of U.S. census baby names and find that proportionally more female names than male names are not recognized as PERSON entities. We study the extent of this bias in several NER systems that are used prominently in industry and academia. In addition, we report a bias in the datasets on which these models were trained. This analysis yields a new benchmark for gender bias evaluation in named entity recognition systems. The data and code for applying this benchmark will be publicly available for researchers to use.
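To make the comparison concrete, the following is a minimal sketch of how such a gap could be computed. It is an assumption for illustration, not the benchmark's actual metric, which the abstract does not specify. The functions `miss_rate` and `gender_gap`, the stand-in tagger `toy_is_person`, and the tiny name lists are all hypothetical. In practice the tagger would wrap a real NER model and the lists would come from the census name data.

```python
from typing import Callable, Iterable


def miss_rate(names: Iterable[str], is_person: Callable[[str], bool]) -> float:
    """Fraction of names that the tagger does NOT label as PERSON."""
    names = list(names)
    if not names:
        raise ValueError("names must not be empty")
    missed = sum(1 for name in names if not is_person(name))
    return missed / len(names)


def gender_gap(
    female_names: Iterable[str],
    male_names: Iterable[str],
    is_person: Callable[[str], bool],
) -> float:
    """Female miss rate minus male miss rate; positive means female names are missed more often."""
    return miss_rate(female_names, is_person) - miss_rate(male_names, is_person)


# Hypothetical stand-in for an NER model: it only "knows" a few names.
KNOWN_PERSON_NAMES = {"John", "Mary", "James", "Robert"}


def toy_is_person(name: str) -> bool:
    return name in KNOWN_PERSON_NAMES


# Illustrative name lists (assumed, not taken from the paper's data).
female = ["Mary", "Charlotte", "April", "Grace"]
male = ["John", "James", "Robert", "Hunter"]

female_miss = miss_rate(female, toy_is_person)
male_miss = miss_rate(male, toy_is_person)
gap = gender_gap(female, male, toy_is_person)

print(f"female miss rate: {female_miss:.2f}")
print(f"male miss rate:   {male_miss:.2f}")
print(f"gap:              {gap:.2f}")
```

Passing the tagger as a callable keeps the metric independent of any particular NER library, so the same comparison can be repeated across several systems.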