Mar 6, 2024

German also Hallucinates! Inconsistency Detection in News Summaries with the Absinth Dataset

ETH Zurich
arXiv:2403.03750v2, 82 citations, h-index: 17, Has Code, LREC
Originality: Synthesis-oriented
AI Analysis

This work addresses hallucination detection in German news summarization. It is incremental, since it extends existing methods to a new language.

The authors address the lack of German data for detecting hallucinations in news summaries by creating the manually annotated Absinth dataset. They also evaluate open-source LLMs on this task, with results such as an F1 score of 0.85 in the fine-tuning setting.

The advent of Large Language Models (LLMs) has led to remarkable progress on a wide range of natural language processing tasks. Despite these advances, such large models still hallucinate information in their output, which poses a major issue in automatic text summarization, as we must guarantee that the generated summary is consistent with the content of the source document. Previous research addresses the challenging task of detecting hallucinations in the output (i.e., inconsistency detection) in order to evaluate the faithfulness of the generated summaries. However, these works primarily focus on English, and recent multilingual approaches lack German data. This work presents absinth, a manually annotated dataset for hallucination detection in German news summarization, and explores the capabilities of novel open-source LLMs on this task in both fine-tuning and in-context learning settings. We open-source and release the absinth dataset to foster further research on hallucination detection in German.

Code Implementations: 1 repo