Common Sense vs. Morality: The Curious Case of Narrative Focus Bias in LLMs
The work addresses a limitation that matters to users relying on LLMs across diverse applications, and it highlights the need for improved commonsense robustness. The contribution is incremental, however: it identifies and analyzes a specific bias without proposing a new training method.
The paper tackles the problem of LLMs prioritizing moral reasoning over commonsense understanding. It introduces CoMoral, a benchmark dataset of commonsense contradictions embedded in moral dilemmas, and finds that models consistently struggle to identify these contradictions without a prior signal. It also reports a narrative focus bias: contradictions are detected more readily when they are attributed to a secondary character than to the narrator.
Large Language Models (LLMs) are increasingly deployed across diverse real-world applications and user communities. As such, it is crucial that these models remain both morally grounded and knowledge-aware. In this work, we uncover a critical limitation of current LLMs -- their tendency to prioritize moral reasoning over commonsense understanding. To investigate this phenomenon, we introduce CoMoral, a novel benchmark dataset containing commonsense contradictions embedded within moral dilemmas. Through extensive evaluation of ten LLMs across different model sizes, we find that existing models consistently struggle to identify such contradictions without prior signal. Furthermore, we observe a pervasive narrative focus bias, wherein LLMs more readily detect commonsense contradictions when they are attributed to a secondary character rather than the primary (narrator) character. Our comprehensive analysis underscores the need for enhanced reasoning-aware training to improve the commonsense robustness of large language models.
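One way to picture the narrative focus bias described above is as a gap between per-role detection rates. The following is a minimal sketch under stated assumptions, not the authors' evaluation code: the record format (a role label plus a detected flag), the function names, and the toy records are all hypothetical and are not drawn from CoMoral or from the paper's results.

```python
from collections import defaultdict


def detection_rates(records):
    """Return the fraction of contradictions detected for each character role.

    `records` is an iterable of (role, detected) pairs. This format is an
    assumption made for illustration, not the benchmark's actual schema.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for role, detected in records:
        totals[role] += 1
        hits[role] += int(detected)
    return {role: hits[role] / totals[role] for role in totals}


def narrative_focus_gap(records):
    """Secondary-character detection rate minus narrator detection rate.

    A positive value means contradictions attributed to the secondary
    character were detected more often than those attributed to the narrator.
    """
    rates = detection_rates(records)
    return rates["secondary"] - rates["narrator"]


# Made-up toy records, for illustration only (not real model outputs).
toy_records = [
    ("narrator", False), ("narrator", True),
    ("narrator", False), ("narrator", False),
    ("secondary", True), ("secondary", True),
    ("secondary", False), ("secondary", True),
]

rates = detection_rates(toy_records)
gap = narrative_focus_gap(toy_records)
print(rates, gap)
```

On real data, each record would correspond to one model response to one benchmark item, and the gap would be computed separately for each model.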