CL | Mar 4, 2024

WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations

arXiv:2403.01774v2 | 26 citations | h-index: 4 | Has Code | ACL
AI Analysis

This work addresses the need for better attribution in LLMs for Chinese web search applications. The contribution is incremental: it extends existing attribution tasks with a new dataset and new metrics.

The authors tackled the problem of attribution in large language models by creating WebCiteS, a Chinese dataset of 7k human-annotated summaries with citations from real-world web search results, and developed detailed metrics for fine-grained evaluation, revealing that LLMs struggle with correct citation.

Enhancing the attribution in large language models (LLMs) is a crucial task. One feasible approach is to enable LLMs to cite external sources that support their generations. However, existing datasets and evaluation methods in this domain still exhibit notable limitations. In this work, we formulate the task of attributed query-focused summarization (AQFS) and present WebCiteS, a Chinese dataset featuring 7k human-annotated summaries with citations. WebCiteS derives from real-world user queries and web search results, offering a valuable resource for model training and evaluation. Prior works in attribution evaluation do not differentiate between groundedness errors and citation errors. They also fall short in automatically verifying sentences that draw partial support from multiple sources. We tackle these issues by developing detailed metrics and enabling the automatic evaluator to decompose the sentences into sub-claims for fine-grained verification. Our comprehensive evaluation of both open-source and proprietary models on WebCiteS highlights the challenge LLMs face in correctly citing sources, underscoring the necessity for further improvement. The dataset and code will be open-sourced to facilitate further research in this crucial field.
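The distinction the abstract draws between groundedness errors and citation errors, together with sub-claim-level verification, can be illustrated with a minimal sketch. This is not the paper's evaluator or its metrics: the function names (`supports`, `classify_sentence`), the three labels, and the word-overlap check are all assumptions made for illustration. A real evaluator would replace `supports` with an entailment (NLI) model.

```python
from typing import Callable, Dict, Iterable, List


def supports(source: str, claim: str) -> bool:
    """Toy stand-in for an NLI model (assumption): every claim word must occur in the source."""
    source_words = set(source.lower().split())
    return all(word in source_words for word in claim.lower().split())


def classify_sentence(
    sub_claims: List[str],
    cited_ids: List[int],
    documents: Dict[int, str],
    entails: Callable[[str, str], bool] = supports,
) -> str:
    """Label one summary sentence, given its sub-claims and the documents it cites.

    Hypothetical labels:
      "supported"          - every sub-claim is backed by at least one cited document
      "citation_error"     - the retrieved documents back every sub-claim, but the cited ones do not
      "groundedness_error" - some sub-claim is backed by no retrieved document at all
    """

    def covered(doc_ids: Iterable[int]) -> bool:
        ids = list(doc_ids)
        # Each sub-claim may draw its support from a different document,
        # which handles sentences partially supported by multiple sources.
        return all(any(entails(documents[i], claim) for i in ids) for claim in sub_claims)

    if covered(cited_ids):
        return "supported"
    if covered(documents.keys()):
        return "citation_error"
    return "groundedness_error"


# Invented example data, for illustration only.
documents = {
    1: "green tea contains caffeine",
    2: "green tea is rich in antioxidants",
    3: "coffee is popular",
}
sub_claims = ["green tea contains caffeine", "green tea is rich in antioxidants"]

label_ok = classify_sentence(sub_claims, [1, 2], documents)
label_bad_citation = classify_sentence(sub_claims, [1, 3], documents)
label_ungrounded = classify_sentence(["green tea cures colds"], [1], documents)
```

In the first call, neither cited document supports the whole sentence on its own, but the two together cover both sub-claims. Checking the sentence as a single unit against each source would miss this case, which is why the sketch verifies sub-claims separately.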

Code Implementations: 1 repo