Adaptable Moral Stances of Large Language Models on Sexist Content: Implications for Society and Gender Discourse
This research highlights the dual capacity of LLMs to both justify and help understand sexist language, posing a significant challenge for developers and users of AI systems in sensitive societal contexts.
This paper investigates how eight large language models (LLMs) apply moral reasoning to both criticize and defend sexist language, demonstrating their ability to provide comprehensible and contextually relevant explanations grounded in diverse moral perspectives. The study found that all eight models could generate arguments for both critiquing and endorsing sexist views, with some models aligning more closely with progressive views on gender roles and others with conservative ones.
This work provides an explanatory view of how LLMs can apply moral reasoning to both criticize and defend sexist language. We assessed eight large language models, all of which were able to provide explanations grounded in varying moral perspectives for both critiquing and endorsing views that reflect sexist assumptions. Using both human and automatic evaluation, we show that all eight models produce comprehensible and contextually relevant text, which helps in understanding diverse views on how sexism is perceived. In addition, by analyzing the moral foundations that the LLMs cite in their arguments, we uncover diverse ideological perspectives in the models' outputs, with some models aligning more closely with progressive views on gender roles and sexism and others with conservative ones. Based on our observations, we caution against the potential misuse of LLMs to justify sexist language. We also highlight that LLMs can serve as tools for understanding the roots of sexist beliefs and for designing well-informed interventions. Given this dual capacity, it is crucial to monitor LLMs and to design safety mechanisms for their use in applications that involve sensitive societal topics, such as sexism.