NormDial: A Comparable Bilingual Synthetic Dialog Dataset for Modeling Social Norm Adherence and Violation
This work, aimed at researchers in computational linguistics and AI ethics, addresses the challenge of understanding social norms in cross-cultural conversational contexts.
The authors tackled the problem of modeling social norm adherence and violation in conversations by creating NormDial, a bilingual synthetic dialogue dataset with turn-by-turn annotations for Chinese and American cultures. They showed that the generated dialogues are of high quality through human evaluation and evaluated existing large language models on the new task of social norm observance detection.
Social norms fundamentally shape interpersonal communication. We present NormDial, a high-quality dyadic dialogue dataset with turn-by-turn annotations of social norm adherences and violations for Chinese and American cultures. We also introduce the task of social norm observance detection. The dataset is synthetically generated in both Chinese and English using a human-in-the-loop pipeline that prompts large language models with a small collection of expert-annotated social norms. We show through human evaluation that the generated dialogues are of high quality, and we further evaluate the performance of existing large language models on this task. Our findings point towards new directions for understanding the nuances of social norms as they manifest in conversational contexts across languages and cultures.
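To make the idea of turn-by-turn annotation concrete, the following is a minimal sketch of how one annotated dialogue and a turn-level score for norm observance detection might be represented. It is not the dataset's actual schema or the authors' evaluation code: the `Turn` class, the label names in `LABELS`, the `turn_accuracy` function, and the example utterances are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical label set; the real dataset may use different names or categories.
LABELS = ("adhered", "violated", "not_relevant")


@dataclass
class Turn:
    """One utterance in a dyadic dialogue with its norm label (assumed schema)."""
    speaker: str
    text: str
    label: str  # expected to be one of LABELS


def turn_accuracy(gold: List[Turn], predicted: List[str]) -> float:
    """Fraction of turns whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(turn.label == pred for turn, pred in zip(gold, predicted))
    return correct / len(gold)


# Invented example dialogue, for illustration only.
dialogue = [
    Turn("A", "Thank you so much for your help.", "adhered"),
    Turn("B", "Whatever, I was busy anyway.", "violated"),
    Turn("A", "The meeting starts at three.", "not_relevant"),
]

# Stand-in for labels produced by a language model.
model_predictions = ["adhered", "violated", "adhered"]
score = turn_accuracy(dialogue, model_predictions)
```

Here the model labels two of the three turns correctly, so `score` is two thirds. Scoring each turn rather than each dialogue mirrors the turn-by-turn granularity of the annotations described above.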