Re-contextualizing Fairness in NLP: The Case of India
This work tackles fairness in NLP for India, highlighting biases in models and corpora, but its contribution is incremental: it extends existing fairness frameworks to a new geo-cultural setting.
The paper addresses the lack of fairness evaluation in NLP for non-Western contexts by focusing on India, building resources to demonstrate prediction biases along social axes like region and religion, and outlining a research agenda to adapt fairness research to the Indian context.
Recent research has revealed undesirable biases in NLP data and models. However, these efforts focus on social disparities in the West, and are not directly portable to other geo-cultural contexts. In this paper, we focus on NLP fairness in the context of India. We start with a brief account of the prominent axes of social disparities in India. We build resources for fairness evaluation in the Indian context and use them to demonstrate prediction biases along some of the axes. We then delve deeper into social stereotypes for Region and Religion, demonstrating their prevalence in corpora and models. Finally, we outline a holistic research agenda to re-contextualize NLP fairness research for the Indian context, accounting for Indian societal context, bridging technological gaps in NLP capabilities and resources, and adapting to Indian cultural values. While we focus on India, this framework can be generalized to other geo-cultural contexts.