CL AI

Nov 18, 2024

Mitigating Knowledge Conflicts in Language Model-Driven Question Answering

Han Cao, Zhaoyang Zhang, Xiangtian Li, Chufan Wu, Hansong Zhang, Wenqing Zhang

arXiv:2411.11344v31.98 citations

h-index: 6

2024 6th International Academic Exchange Conference on Science and Technology Innovation (IAECST)

Originality: Synthesis-oriented

AI Analysis

This work addresses a specific issue in knowledge-driven tasks such as QA, but it appears incremental: it builds on known challenges and makes no broad SOTA claims.

The paper tackles the problem of knowledge conflicts in language models for question answering, where misalignment between model knowledge and training data leads to hallucinations, and proposes a strategy to reduce these by linking source inputs to outputs.

In knowledge-driven seq-to-seq generation tasks, such as document-based question answering and document summarization, two fundamental knowledge sources play crucial roles: the inherent knowledge embedded in the model parameters and the external knowledge obtained through context. Recent studies have revealed a significant challenge: when the model's inherent knowledge is misaligned with the ground-truth answers in the training data, the system may exhibit problematic behaviors during inference, such as ignoring the input context or generating unfaithful content. Our investigation proposes a strategy to minimize hallucination by building an explicit connection between source inputs and generated outputs. We specifically target a common hallucination pattern in question answering, examining how the correspondence between entities and their contexts during model training influences the system's performance at inference time.
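To make the idea of entity-context correspondence concrete, the following minimal sketch separates QA training examples whose gold answer appears in the supplied context from those whose answer is absent from it. This is an illustration only and is not taken from the paper: the function names (`normalize`, `is_grounded`, `split_by_grounding`), the dictionary fields, the sample data, and the exact-span matching rule are all assumptions.

```python
import re


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


def is_grounded(answer: str, context: str) -> bool:
    """Return True when the answer occurs as a whole-token span in the context.

    Assumption: simple string matching stands in for a real entity linker.
    """
    a = normalize(answer)
    c = normalize(context)
    # Pad with spaces so "par" does not match inside "paris".
    return bool(a) and f" {a} " in f" {c} "


def split_by_grounding(examples):
    """Partition examples into (grounded, ungrounded) lists."""
    grounded, ungrounded = [], []
    for ex in examples:
        target = grounded if is_grounded(ex["answer"], ex["context"]) else ungrounded
        target.append(ex)
    return grounded, ungrounded


# Hypothetical toy data: the second answer is not supported by its context,
# so a model trained on it must rely on parametric knowledge instead.
examples = [
    {
        "question": "Where is the Eiffel Tower?",
        "context": "The Eiffel Tower is a landmark in Paris, France.",
        "answer": "Paris",
    },
    {
        "question": "Who wrote Hamlet?",
        "context": "Hamlet is a tragedy first performed around 1600.",
        "answer": "William Shakespeare",
    },
]

grounded, ungrounded = split_by_grounding(examples)
```

Examples in the ungrounded list are the kind where the training target cannot be derived from the input, which is one plausible way a model could learn to ignore its context. Whether such examples should be filtered, relabeled, or handled otherwise is a design decision this sketch does not settle.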

