CL · Apr 1

Are Finer Citations Always Better? Rethinking Granularity for Attributed Generation

Hexuan Wang, Jingyu Zhang, Benjamin Van Durme, Daniel Khashabi

arXiv:2604.01432 · 88.4 · h-index: 16

AI Analysis

This work examines the tension between human verification needs and model constraints in attributed generation, showing that the common practice of enforcing fine-grained citations can be counterproductive.

The study investigates how citation granularity affects attribution quality across four model scales (8B-120B). Enforcing fine-grained (sentence-level) citations degrades attribution quality by 16-276% relative to the best-performing granularity, which is consistently an intermediate one (paragraph-level). Citing at that optimal granularity preserves or even improves answer correctness.

Citation granularity - whether to cite individual sentences, paragraphs, or documents - is a critical design choice in attributed generation. While fine-grained citations are often preferred for precise human verification, their impact on model performance remains under-explored. We analyze four model scales (8B-120B) and demonstrate that enforcing fine-grained citations degrades attribution quality by 16-276% compared to the best-performing granularity. We observe a consistent performance pattern where attribution quality peaks at intermediate granularities (paragraph-level). Our analysis suggests that fine-grained (sentence-level) citations disrupt necessary semantic dependencies for attributing evidence to answer claims, while excessively coarse citations (multi-paragraph) introduce distracting noise. Importantly, the magnitude of this performance gap varies non-monotonically with model scale: fine-grained constraints disproportionately penalize larger models, suggesting that atomic citation units disrupt the multi-sentence information synthesis at which these models excel. Strikingly, citation-optimal granularity leads to substantial gains in attribution quality while preserving or even improving answer correctness. Overall, our findings demonstrate that optimizing solely for human verification via fine-grained citation disregards model constraints, compromising both attribution faithfulness and generation reliability. Instead, effective attribution requires aligning citation granularity with the model's natural semantic scope.
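To make the three granularities named in the abstract concrete, here is a minimal Python sketch that segments one source text into citable units at sentence, paragraph, or document level. It is an illustration only: the abstract does not say how the authors segment their sources, the function name `citation_units` is hypothetical, and the regex-based sentence splitter is a naive stand-in for whatever segmentation the paper uses.

```python
import re


def citation_units(document: str, granularity: str) -> list[str]:
    """Split a source text into citable units (hypothetical helper, not from the paper)."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", document) if p.strip()]

    if granularity == "document":
        # Coarsest option: the whole text is a single citable unit.
        return ["\n\n".join(paragraphs)] if paragraphs else []
    if granularity == "paragraph":
        return paragraphs
    if granularity == "sentence":
        # Naive splitter: break after ., !, or ? followed by whitespace.
        units = []
        for paragraph in paragraphs:
            units.extend(s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s)
        return units
    raise ValueError(f"unknown granularity: {granularity!r}")


if __name__ == "__main__":
    source = "A is B. C is D.\n\nE is F."
    for level in ("sentence", "paragraph", "document"):
        units = citation_units(source, level)
        # Number the units the way a citation marker such as [1] would refer to them.
        print(level, [f"[{i}] {u}" for i, u in enumerate(units, start=1)])
```

Under this sketch, a claim that depends on both "A is B." and "C is D." needs two sentence-level citations but only one paragraph-level citation, which is the kind of multi-sentence dependency the abstract says sentence-level units can disrupt.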
