CL, Apr 28, 2025

Conflicts in Texts: Data, Implications and Challenges

arXiv:2504.19472v1, 2 citations, h-index: 3, EMNLP

Originality: Synthesis-oriented

AI Analysis

The survey addresses model unreliability caused by conflicting information, a concern for NLP practitioners and researchers. Its contribution is incremental: it synthesizes existing work into a unified framework.

This survey tackles the problem of NLP models relying on and generating conflicting information. It categorizes conflicts by where they arise (natural texts, human-annotated data, and model interactions) and discusses mitigation strategies to improve reliability.

As NLP models become increasingly integrated into real-world applications, it becomes clear that there is a need to address the fact that models often rely on and generate conflicting information. Conflicts could reflect the complexity of situations, changes that need to be explained and dealt with, difficulties in data annotation, and mistakes in generated outputs. In all cases, disregarding the conflicts in data could result in undesired behaviors of models and undermine NLP models' reliability and trustworthiness. This survey categorizes these conflicts into three key areas: (1) natural texts on the web, where factual inconsistencies, subjective biases, and multiple perspectives introduce contradictions; (2) human-annotated data, where annotator disagreements, mistakes, and societal biases impact model training; and (3) model interactions, where hallucinations and knowledge conflicts emerge during deployment. While prior work has addressed some of these conflicts in isolation, we unify them under the broader concept of conflicting information, analyze their implications, and discuss mitigation strategies. We highlight key challenges and future directions for developing conflict-aware NLP systems that can reason over and reconcile conflicting information more effectively.
