CL | Feb 15, 2025

CiteCheck: Towards Accurate Citation Faithfulness Detection

Peking U
arXiv:2502.10881v1 | 2 citations | h-index: 26 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem for Chinese retrieval-augmented generation systems by providing a challenging dataset, though it is incremental as it builds on existing detection tasks.

The authors address the scarcity of large-scale Chinese datasets for citation faithfulness detection by introducing CiteCheck, a dataset built with a cost-effective two-stage manual annotation method that balances positive and negative samples. With training data augmented by LLM-generated negative samples, smaller models reach strong performance using parameter-efficient fine-tuning.

Citation faithfulness detection is critical for enhancing retrieval-augmented generation (RAG) systems, yet large-scale Chinese datasets for this task are scarce. Existing methods face prohibitive costs due to the need for manually annotated negative samples. To address this, we introduce the first large-scale Chinese dataset CiteCheck for citation faithfulness detection, constructed via a cost-effective approach using two-stage manual annotation. This method balances positive and negative samples while significantly reducing annotation expenses. CiteCheck comprises training and test splits. Experiments demonstrate that: (1) the test samples are highly challenging, with even state-of-the-art LLMs failing to achieve high accuracy; and (2) training data augmented with LLM-generated negative samples enables smaller models to attain strong performance using parameter-efficient fine-tuning. CiteCheck provides a robust foundation for advancing citation faithfulness detection in Chinese RAG systems. The dataset is publicly available to facilitate research.
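As a rough illustration of the task described in the abstract, citation faithfulness detection can be framed as binary classification over (statement, cited passage) pairs. The sketch below is not taken from the paper or its repository: the `Sample` fields, the `accuracy` helper, the `always_faithful` baseline, and the English toy examples are all hypothetical (the actual dataset is in Chinese). It only shows why balancing positive and negative samples matters: on a balanced set, a trivial predictor that calls every citation faithful scores no better than chance.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Sample:
    """Hypothetical record layout; the real dataset schema may differ."""
    statement: str   # sentence produced by the RAG system
    citation: str    # passage the statement cites
    faithful: bool   # True if the passage supports the statement


def accuracy(predict: Callable[[Sample], bool], samples: Sequence[Sample]) -> float:
    """Fraction of samples whose predicted label matches the gold label."""
    if not samples:
        raise ValueError("samples must be non-empty")
    correct = sum(predict(s) == s.faithful for s in samples)
    return correct / len(samples)


def always_faithful(sample: Sample) -> bool:
    """Trivial baseline: claim every citation supports its statement."""
    return True


# Toy, balanced evaluation set (two faithful, two unfaithful pairs).
toy = [
    Sample("The tower is 300 m tall.", "The tower stands 300 m high.", True),
    Sample("The tower opened in 1889.", "The tower opened to the public in 1889.", True),
    Sample("The tower is 500 m tall.", "The tower stands 300 m high.", False),
    Sample("The tower opened in 1920.", "The tower opened to the public in 1889.", False),
]

print(accuracy(always_faithful, toy))  # 0.5
```

If the set were dominated by positive samples, the same baseline would score deceptively high, which is one reason a balanced test split is useful for this task.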

Code Implementations: 1 repo
