Towards Region-aware Bias Evaluation Metrics
This work addresses the need for more culturally sensitive bias evaluation metrics in NLP, though it is incremental as it builds on existing methods like WEAT.
The paper tackles the problem of evaluating gender bias in language models by identifying region-specific bias dimensions. It proposes a region-aware bottom-up approach that generates topic pairs aligned with local societal biases. Several of these pairs match human perception of gender bias as well as existing pairs do, some new pairs align better than existing ones, and LLMs show higher alignment with the bias pairs of well-represented regions.
When exposed to human-generated data, language models are known to learn and amplify societal biases. While previous works introduced benchmarks that can be used to assess the bias in these models, they rely on assumptions that may not be universally true. For instance, a gender bias dimension commonly used by these metrics is that of family--career, but this may not be the only common bias in certain regions of the world. In this paper, we identify topical differences in gender bias across different regions and propose a region-aware bottom-up approach for bias assessment. Our proposed approach uses gender-aligned topics for a given region and identifies gender bias dimensions in the form of topic pairs that are likely to capture societal gender biases. Several of our proposed bias topic pairs align with human perception of gender biases in these regions as well as the existing pairs do, and we also identify new pairs that are more aligned than the existing ones. In addition, we use our region-aware bias topic pairs in a Word Embedding Association Test (WEAT)-based evaluation metric to test for gender biases across different regions in different data domains. We also find that LLMs have a higher alignment to bias pairs for highly represented regions, showing the importance of region-aware bias evaluation metrics.
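To make the WEAT-based evaluation concrete, here is a minimal sketch of the standard WEAT effect size applied to one topic pair. It is an illustration under stated assumptions, not the paper's implementation: the function names, the two-dimensional toy vectors, and the choice of the sample standard deviation are all hypothetical, and the paper's WEAT-based metric may differ in its details. Here `X` and `Y` stand for word vectors from the two topics of a pair (for example family and career), and `A` and `B` stand for vectors of two gendered attribute word sets.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    return mean([cosine(w, a) for a in A]) - mean([cosine(w, b) for b in B])


def weat_effect_size(X, Y, A, B):
    """Difference in mean association of the two target sets, normalized by
    the standard deviation of associations over X and Y combined.
    Assumption: sample standard deviation (n - 1 in the denominator)."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Toy 2-d vectors, invented for illustration only (not real embeddings).
A = [[1.0, 0.0], [0.9, 0.1]]  # hypothetical attribute set 1
B = [[0.0, 1.0], [0.1, 0.9]]  # hypothetical attribute set 2
X = [[0.8, 0.2], [0.7, 0.3]]  # hypothetical topic 1 words
Y = [[0.2, 0.8], [0.3, 0.7]]  # hypothetical topic 2 words

effect = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size: {effect:.3f}")
```

A positive effect size means topic `X` is more associated with attribute set `A` (and topic `Y` with `B`); swapping the two topics flips the sign. In a region-aware setting, the same computation would be repeated with the topic pairs identified for each region.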