CL LG | Mar 12, 2024

SIFiD: Reassess Summary Factual Inconsistency Detection with LLM

Jiuding Yang, Hui Liu, Weidong Guo, Zhuwei Rao, Yu Xu, Di Niu

arXiv:2403.07557v1 | 1 citation | h-index: 13 | KNOW LLM

Originality: Incremental advance

AI Analysis

This work addresses the challenge of ensuring factual accuracy in automated summaries, which is crucial for applications like news aggregation and content generation, but it appears incremental as it builds on existing LLM-based detection efforts.

The paper reassesses factual inconsistency detection in summarization with Large Language Models (LLMs), comparing GPT-3.5 and GPT-4; earlier LLM-based attempts had underperformed traditional models. It proposes SIFiD, a method that identifies key sentences in documents using natural language inference or semantic similarity to improve detection.

Ensuring factual consistency between the summary and the original document is paramount in summarization tasks. Consequently, considerable effort has been dedicated to detecting inconsistencies. With the advent of Large Language Models (LLMs), recent studies have begun to leverage their advanced language understanding capabilities for inconsistency detection. However, early attempts have shown that LLMs underperform traditional models due to their limited ability to follow instructions and the absence of an effective detection methodology. In this study, we reassess summary inconsistency detection with LLMs, comparing the performance of GPT-3.5 and GPT-4. To advance research in LLM-based inconsistency detection, we propose SIFiD (Summary Inconsistency Detection with Filtered Document), which identifies key sentences within documents by either employing natural language inference or measuring semantic similarity between summaries and documents.
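
The similarity-based filtering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`split_sentences`, `bag_of_words`, `cosine`, `filter_document`), the 0.3 threshold, and the bag-of-words cosine score are all assumptions made for illustration. The cosine score stands in for whatever semantic similarity or NLI model is actually used. The sketch keeps only the document sentences that match some summary sentence closely enough; the filtered text would then be passed to the LLM in place of the full document.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentence):
    """Lowercased token counts; a crude stand-in for a sentence embedding (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def filter_document(document, summary, threshold=0.3):
    """Keep document sentences whose best similarity to any summary sentence
    reaches the (hypothetical) threshold, preserving document order."""
    summary_vecs = [bag_of_words(s) for s in split_sentences(summary)]
    kept = []
    for sent in split_sentences(document):
        vec = bag_of_words(sent)
        score = max((cosine(vec, sv) for sv in summary_vecs), default=0.0)
        if score >= threshold:
            kept.append(sent)
    return kept


if __name__ == "__main__":
    document = (
        "The company reported record profits in 2023. "
        "The weather in Paris was mild. "
        "Profits rose because of strong phone sales."
    )
    summary = "The company reported record profits. Strong phone sales drove profits."
    for sentence in filter_document(document, summary):
        print(sentence)
```

In this toy input the sentence about the weather shares almost no vocabulary with the summary and is dropped, while the two sentences about profits are kept. A real system would presumably swap the bag-of-words score for a learned sentence encoder or an NLI entailment score.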

