Speciesist Language and Nonhuman Animal Bias in English Masked Language Models
This work addresses bias against nonhuman animals, a problem largely overlooked in NLP and relevant to both researchers and practitioners. Its contribution is incremental, since it extends existing bias-analysis methods to a new domain.
The paper analyzes speciesist bias in English masked language models such as BERT. In experiments on 46 animal names using template-based and corpus-extracted sentences, it finds that these models tend to associate harmful words with nonhuman animals and are biased toward speciesist language for some animal names.
Various existing studies have analyzed which social biases NLP models inherit. Because these biases may directly or indirectly harm people, previous studies have focused only on human attributes. Until recently, however, there was no research on social biases in NLP regarding nonhumans. In this paper, we analyze biases against nonhuman animals, i.e., speciesist bias, inherent in English masked language models such as BERT. We analyze speciesist bias against 46 animal names using template-based and corpus-extracted sentences containing speciesist (or non-speciesist) language. We find that pre-trained masked language models tend to associate harmful words with nonhuman animals and are biased toward using speciesist language for some nonhuman animal names. Our code for reproducing the experiments will be made available on GitHub.
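To make the template-based setup more concrete, the following is a minimal sketch of how a word-association gap could be computed for each target name. It is not the authors' implementation: the template, the word lists, the function names (`fill_templates`, `association_gap`), and the `toy_scorer` with its placeholder numbers are all assumptions introduced here for illustration. In a real experiment, the scorer would presumably return a masked language model's probability for the candidate word at the `[MASK]` position.

```python
from typing import Callable, Dict, List

# A scorer maps (sentence containing [MASK], candidate word) to a score.
Scorer = Callable[[str, str], float]


def fill_templates(targets: List[str], templates: List[str]) -> Dict[str, List[str]]:
    """Instantiate every template once per target name."""
    return {t: [tpl.format(target=t) for tpl in templates] for t in targets}


def association_gap(
    scorer: Scorer,
    sentences_by_target: Dict[str, List[str]],
    harmful_words: List[str],
    neutral_words: List[str],
) -> Dict[str, float]:
    """Mean score of harmful words minus mean score of neutral words, per target."""
    gaps = {}
    for target, sentences in sentences_by_target.items():
        harmful = [scorer(s, w) for s in sentences for w in harmful_words]
        neutral = [scorer(s, w) for s in sentences for w in neutral_words]
        gaps[target] = sum(harmful) / len(harmful) - sum(neutral) / len(neutral)
    return gaps


def toy_scorer(sentence: str, word: str) -> float:
    """Stand-in for a model; the numbers are placeholders, not measured results."""
    return 0.3 if ("pig" in sentence and word == "dirty") else 0.1


# Hypothetical template and word lists, chosen only to exercise the code.
templates = ["The {target} is [MASK]."]
sentences = fill_templates(["pig", "human"], templates)
gaps = association_gap(toy_scorer, sentences, ["dirty"], ["calm"])
print(gaps)
```

Passing the scorer in as a function keeps the bookkeeping independent of any particular model, so the same gap computation could be reused with template-based or corpus-extracted sentences.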