CL · May 17, 2025

CCNU at SemEval-2025 Task 3: Leveraging Internal and External Knowledge of Large Language Models for Multilingual Hallucination Annotation

arXiv:2505.11965v1 · 1 citation · h-index: 11 · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses hallucination detection in question-answering (QA) systems across 14 languages, an incremental improvement for multilingual NLP applications.

The paper tackles the problem of identifying hallucinations in multilingual question-answering systems by leveraging multiple Large Language Models with internal and external knowledge, achieving the top ranking for Hindi data and Top-5 positions in seven other languages.

We present the system developed by the Central China Normal University (CCNU) team for the Mu-SHROOM shared task, which focuses on identifying hallucinations in question-answering systems across 14 different languages. Our approach leverages multiple Large Language Models (LLMs) with distinct areas of expertise, employing them in parallel to annotate hallucinations, effectively simulating a crowdsourcing annotation process. Furthermore, each LLM-based annotator integrates both internal and external knowledge related to the input during the annotation process. Using the open-source LLM DeepSeek-V3, our system achieves the top ranking (#1) for Hindi data and secures a Top-5 position in seven other languages. In this paper, we also discuss unsuccessful approaches explored during our development process and share key insights gained from participating in this shared task.
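The crowdsourcing-style setup described above can be illustrated with a minimal sketch of how several annotators' span-level judgments might be merged. The abstract does not say how the individual LLM annotations are combined, so the per-character vote fraction and the 0.5 threshold below are assumptions, and the names `soft_labels` and `hard_spans` are hypothetical. No LLM is called here; each annotator's output is represented as a hand-written list of character spans.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # [start, end) character offsets into the answer text


def soft_labels(text_len: int, annotations: List[List[Span]]) -> List[float]:
    """Fraction of annotators that marked each character as hallucinated.

    Assumption: simple unweighted voting; the paper may aggregate differently.
    """
    votes = [0] * text_len
    for spans in annotations:
        # Mark each character at most once per annotator, even if spans overlap.
        marked = [False] * text_len
        for start, end in spans:
            for i in range(max(0, start), min(text_len, end)):
                marked[i] = True
        for i, is_marked in enumerate(marked):
            votes[i] += is_marked
    n = len(annotations)
    return [v / n for v in votes] if n else [0.0] * text_len


def hard_spans(probs: List[float], threshold: float = 0.5) -> List[Span]:
    """Merge consecutive characters whose vote fraction meets the threshold."""
    spans: List[Span] = []
    start = None
    for i, p in enumerate(probs):
        if p >= threshold:
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(probs)))
    return spans


answer = "Paris is the capital of Spain."
# Three hypothetical annotators: two flag "Spain", one flags "Paris".
annotations = [[(24, 29)], [(24, 29)], [(0, 5)]]
probs = soft_labels(len(answer), annotations)
print(hard_spans(probs))  # [(24, 29)]
```

With majority voting, the span flagged by only one of the three annotators is dropped, while the span two of them agree on survives. A weighted vote reflecting each annotator's area of expertise would be a natural variation, but that too would be an assumption beyond what the abstract states.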

