CL · Sep 25, 2024

Enhancing Post-Hoc Attributions in Long Document Comprehension via Coarse Grained Answer Decomposition

Pritika Ramu, Koustava Goswami, Apoorv Saxena, Balaji Vasan Srinivasan

arXiv:2409.17073v415.425 citations · h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses the challenge of reliable attribution in long-document comprehension, which is crucial for developing trustworthy question-answering systems. The contribution appears incremental, since it builds on existing post-hoc attribution methods.

The paper tackles the problem of attributing answers to their source documents in long-document question answering. It proposes decomposing answers into factual units using template-based in-context learning with negative sampling, which the authors report improves semantic understanding of both abstractive and extractive answers.

Accurately attributing answer text to its source document is crucial for developing a reliable question-answering system. However, attribution for long documents remains largely unexplored. Post-hoc attribution systems are designed to map answer text back to the source document, yet the granularity of this mapping has not been addressed. Furthermore, a critical question arises: What exactly should be attributed? This involves identifying the specific information units within an answer that require grounding. In this paper, we propose and investigate a novel approach to the factual decomposition of generated answers for attribution, employing template-based in-context learning. To accomplish this, we utilize the question and integrate negative sampling during few-shot in-context learning for decomposition. This approach enhances the semantic understanding of both abstractive and extractive answers. We examine the impact of answer decomposition by providing a thorough examination of various attribution approaches, ranging from retrieval-based techniques to LLM-based attributors.
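To make the decompose-then-attribute idea concrete, here is a minimal sketch of a retrieval-style post-hoc attributor. It is not the authors' method. The paper decomposes answers with an LLM using template-based in-context learning, whereas this sketch substitutes naive sentence splitting for that step, and it uses Jaccard token overlap as a stand-in for the retrieval-based attributors the abstract mentions. All function names (`tokenize`, `split_units`, `jaccard`, `attribute`) are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def split_units(answer):
    """Assumed stand-in for answer decomposition: split after ., ! or ?."""
    parts = re.split(r"(?<=[.!?])\s+", answer.strip())
    return [p for p in parts if p]


def jaccard(a, b):
    """Jaccard similarity of two token sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def attribute(answer, source_sentences):
    """Map each answer unit to its best-overlapping source sentence.

    Returns a list of (unit, source_index, score) tuples. source_index is
    None when there are no source sentences to attribute to.
    """
    source_tokens = [tokenize(s) for s in source_sentences]
    results = []
    for unit in split_units(answer):
        if not source_tokens:
            results.append((unit, None, 0.0))
            continue
        unit_tokens = tokenize(unit)
        scores = [jaccard(unit_tokens, st) for st in source_tokens]
        # Pick the highest-scoring source sentence (first one on ties).
        best = max(range(len(scores)), key=scores.__getitem__)
        results.append((unit, best, scores[best]))
    return results
```

Calling `attribute(answer, source_sentences)` yields one tuple per answer unit, so a finer decomposition of the answer directly produces finer-grained attributions. That granularity is the question the abstract raises.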
