CL · Sep 12, 2023

Cited Text Spans for Citation Text Generation

arXiv:2309.06365v2 · 9 citations · h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the generation of accurate, grounded citation texts for researchers and automated systems. It is an incremental advance: it changes the model input, conditioning on cited text spans rather than abstracts, to reduce hallucinations.

The paper targets non-factual hallucinations in abstractive citation text generation, which arise when models condition only on the abstract of the cited paper. It proposes conditioning on cited text spans (CTS) instead. Distantly labeled candidate CTS sentences perform well enough to substitute for human annotations in model training, and a human-in-the-loop, keyword-based retrieval approach makes grounding citation texts in the full text of the cited paper practical.

An automatic citation generation system aims to concisely and accurately describe the relationship between two scientific articles. To do so, such a system must ground its outputs to the content of the cited paper to avoid non-factual hallucinations. Due to the length of scientific documents, existing abstractive approaches have conditioned only on cited paper abstracts. We demonstrate empirically that the abstract is not always the most appropriate input for citation generation and that models trained in this way learn to hallucinate. We propose to condition instead on the cited text span (CTS) as an alternative to the abstract. Because manual CTS annotation is extremely time- and labor-intensive, we experiment with distant labeling of candidate CTS sentences, achieving sufficiently strong performance to substitute for expensive human annotations in model training, and we propose a human-in-the-loop, keyword-based CTS retrieval approach that makes generating citation texts grounded in the full text of cited papers both promising and practical.
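The keyword-based retrieval idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how keywords are matched or ranked, so the sentence splitter, the overlap score, and the names `split_sentences`, `tokenize`, and `retrieve_cts` are all assumptions made for illustration. The sketch assumes a human supplies a few keywords and the system returns the cited-paper sentences that match the most of them.

```python
import re


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens, as a set for overlap counting.
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def retrieve_cts(full_text, keywords, top_k=2):
    """Return up to top_k sentences, ranked by distinct keywords matched.

    Hypothetical scoring: exact token overlap, ties broken by document order.
    Sentences that match no keyword are never returned.
    """
    wanted = {k.lower() for k in keywords}
    scored = []
    for idx, sentence in enumerate(split_sentences(full_text)):
        score = len(wanted & tokenize(sentence))
        if score > 0:
            scored.append((score, idx, sentence))
    # Highest score first; earlier sentences win ties.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [sentence for _, _, sentence in scored[:top_k]]


if __name__ == "__main__":
    # Toy stand-in for the full text of a cited paper.
    cited_paper = (
        "We study citation generation. "
        "Our model conditions on retrieved spans from the full text. "
        "Experiments show fewer hallucinations with span inputs. "
        "We thank the reviewers."
    )
    for sentence in retrieve_cts(cited_paper, ["spans", "hallucinations", "full"]):
        print(sentence)
```

In a human-in-the-loop setting, the user would inspect the returned sentences and revise the keywords until the candidates cover what the citation should say. Exact token matching is the simplest possible choice here; stemming or a weighted ranking function would be natural substitutes.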

Code Implementations: 1 repo