CL · Dec 24, 2022

A Comprehensive Study of Gender Bias in Chemical Named Entity Recognition Models

arXiv:2212.12799v27.030 citations · h-index: 16

Originality: Incremental advance

AI Analysis

This work addresses potential harm from performance disparities in chemical NER models, which are used in tasks like adverse drug reaction identification, by highlighting biases that could affect healthcare applications.

The paper assessed gender-related performance disparities in chemical named entity recognition (NER) models, revealing biases such as female-related names being misclassified as chemicals and systems failing to detect contraceptives like birth control.

Chemical named entity recognition (NER) models are used in many downstream tasks, from adverse drug reaction identification to pharmacoepidemiology. However, it is unknown whether these models work the same for everyone. Performance disparities can potentially cause harm rather than the intended good. This paper assesses gender-related performance disparities in chemical NER systems. We develop a framework for measuring gender bias in chemical NER models using synthetic data and a newly annotated corpus of over 92,405 words with self-identified gender information from Reddit. Our evaluation of multiple biomedical NER models reveals evident biases. For instance, synthetic data suggests female-related names are frequently misclassified as chemicals, especially for brand name mentions. Additionally, we observe performance disparities between female- and male-associated data in both datasets. Many systems fail to detect contraceptives such as birth control. Our findings emphasize the biases in chemical NER models, urging practitioners to account for these biases in downstream applications.
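As a rough illustration of what a "performance disparity" between female- and male-associated data can mean, the sketch below computes exact-match span F1 for each group and reports the difference. It is not the authors' framework: the span format, the group labels, the function names `span_f1` and `f1_gap`, and the toy annotations are all hypothetical and chosen only for this example.

```python
from typing import Dict, Set, Tuple

# Hypothetical span format: (document id, start offset, end offset).
Span = Tuple[int, int, int]


def span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Exact-match span F1; returns 0.0 when there are no true positives."""
    true_positives = len(gold & pred)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def f1_gap(
    gold_by_group: Dict[str, Set[Span]],
    pred_by_group: Dict[str, Set[Span]],
    group_a: str = "female",
    group_b: str = "male",
) -> float:
    """F1 of group_a minus F1 of group_b; negative means group_a scores lower."""
    f1_a = span_f1(gold_by_group[group_a], pred_by_group[group_a])
    f1_b = span_f1(gold_by_group[group_b], pred_by_group[group_b])
    return f1_a - f1_b


# Toy annotations (made up for illustration, not data from the paper).
gold = {
    "female": {(0, 5, 18), (1, 0, 7)},
    "male": {(3, 0, 9), (4, 4, 11)},
}
pred = {
    # One chemical mention missed and one spurious span predicted.
    "female": {(1, 0, 7), (2, 3, 9)},
    # Both chemical mentions found, no spurious spans.
    "male": {(3, 0, 9), (4, 4, 11)},
}

gap = f1_gap(gold, pred)
print(f"F1 gap (female - male): {gap:+.2f}")
```

In this toy case the female-associated group has one missed mention and one false positive, so its F1 is lower than the male-associated group's and the gap is negative. A real evaluation would also need significance testing and a much larger sample before drawing conclusions.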
