Akihiro Nishi

2papers

2 Papers

CLOct 15, 2021
Socially Aware Bias Measurements for Hindi Language Representations

Vijit Malik, Sunipa Dev, Akihiro Nishi et al.

Language representations are efficient tools used across NLP applications, but they are strife with encoded societal biases. These biases are studied extensively, but with a primary focus on English language representations and biases common in the context of Western society. In this work, we investigate biases present in Hindi language representations with focuses on caste and religion-associated biases. We demonstrate how biases are unique to specific language representations based on the history and culture of the region they are widely spoken in, and how the same societal bias (such as binary gender-associated biases) is encoded by different words and text spans across languages. The discoveries of our work highlight the necessity of culture awareness and linguistic artifacts when modeling language representations, in order to better understand the encoded biases.

CLAug 7, 2021
On Measures of Biases and Harms in NLP

Sunipa Dev, Emily Sheng, Jieyu Zhao et al.

Recent studies show that Natural Language Processing (NLP) technologies propagate societal biases about demographic groups associated with attributes such as gender, race, and nationality. To create interventions and mitigate these biases and associated harms, it is vital to be able to detect and measure such biases. While existing works propose bias evaluation and mitigation methods for various tasks, there remains a need to cohesively understand the biases and the specific harms they measure, and how different measures compare with each other. To address this gap, this work presents a practical framework of harms and a series of questions that practitioners can answer to guide the development of bias measures. As a validation of our framework and documentation questions, we also present several case studies of how existing bias measures in NLP -- both intrinsic measures of bias in representations and extrinsic measures of bias of downstream applications -- can be aligned with different harms and how our proposed documentation questions facilitates more holistic understanding of what bias measures are measuring.