CL Apr 18, 2024

NormAd: A Framework for Measuring the Cultural Adaptability of Large Language Models

Abhinav Rao, Akhila Yerukola, Vishwa Shah, Katharina Reinecke, Maarten Sap

AI2, CMU

arXiv:2404.12464v1019.841 citations, h-index: 35, Has Code, NAACL

Originality: Incremental advance

AI Analysis

This work addresses the need for LLMs to adapt to diverse global cultures for safe deployment, though the advance is incremental because it focuses on evaluating cultural adaptability rather than improving it.

The authors introduce NormAd, a framework for assessing the cultural adaptability of large language models by measuring how well they judge social acceptability across cultural norms. They find that LLMs struggle: even in the simplest setting, where the relevant social norms are provided, the best models score below 82% accuracy, while humans exceed 95%.

To be effectively and safely deployed to global user populations, large language models (LLMs) may need to adapt outputs to user values and cultures, not just know about them. We introduce NormAd, an evaluation framework to assess LLMs' cultural adaptability, specifically measuring their ability to judge social acceptability across varying levels of cultural norm specificity, from abstract values to explicit social norms. As an instantiation of our framework, we create NormAd-Eti, a benchmark of 2.6k situational descriptions representing social-etiquette related cultural norms from 75 countries. Through comprehensive experiments on NormAd-Eti, we find that LLMs struggle to accurately judge social acceptability across these varying degrees of cultural contexts and show stronger adaptability to English-centric cultures over those from the Global South. Even in the simplest setting where the relevant social norms are provided, the best LLMs' performance (< 82%) lags behind humans (> 95%). In settings with abstract values and country information, model performance drops substantially (< 60%), while human accuracy remains high (> 90%). Furthermore, we find that models are better at recognizing socially acceptable versus unacceptable situations. Our findings showcase the current pitfalls in socio-cultural reasoning of LLMs which hinder their adaptability for global audiences.
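The accuracy comparisons in the abstract (by level of cultural context, and by acceptable versus unacceptable situations) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the record layout, the field names `context`, `gold`, and `pred`, the level names, the label strings, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy records, NOT taken from NormAd-Eti. Each record pairs an
# assumed context level ("country", "value", or "norm") with a gold label
# and a model prediction.
records = [
    {"context": "norm", "gold": "acceptable", "pred": "acceptable"},
    {"context": "norm", "gold": "unacceptable", "pred": "unacceptable"},
    {"context": "value", "gold": "acceptable", "pred": "acceptable"},
    {"context": "value", "gold": "unacceptable", "pred": "acceptable"},
    {"context": "country", "gold": "unacceptable", "pred": "acceptable"},
    {"context": "country", "gold": "acceptable", "pred": "acceptable"},
]


def accuracy_by(records, key):
    """Return {group: fraction of records where pred == gold}, grouped by `key`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r[key]] += 1
        correct[r[key]] += int(r["pred"] == r["gold"])
    return {group: correct[group] / total[group] for group in total}


# Accuracy per context level, then per gold label.
by_context = accuracy_by(records, "context")
by_label = accuracy_by(records, "gold")
print(by_context)
print(by_label)
```

Grouping by `gold` rather than `context` gives per-label accuracy, which is the kind of breakdown behind the observation that models handle acceptable situations better than unacceptable ones.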
