CL · Jan 8

GenProve: Learning to Generate Text with Fine-Grained Provenance

arXiv:2601.04932v1 · 1 citation · h-index: 10
Originality: Highly original
AI Analysis

This paper addresses the insufficient accountability of LLM-generated text for users who need to verify claims, though it is incremental in that it builds on existing citation methods.

The paper tackles LLM hallucination by introducing a task for generating text with fine-grained provenance. Its GenProve framework outperforms 14 strong LLMs in a joint evaluation of answer fidelity and provenance correctness.

Large language models (LLMs) often hallucinate, and while adding citations is a common solution, it is frequently insufficient for accountability as users struggle to verify how a cited source supports a generated claim. Existing methods are typically coarse-grained and fail to distinguish between direct quotes and complex reasoning. In this paper, we introduce Generation-time Fine-grained Provenance, a task where models must generate fluent answers while simultaneously producing structured, sentence-level provenance triples. To enable this, we present ReFInE (Relation-aware Fine-grained Interpretability & Evidence), a dataset featuring expert-verified annotations that distinguish between Quotation, Compression, and Inference. Building on ReFInE, we propose GenProve, a framework that combines Supervised Fine-Tuning (SFT) with Group Relative Policy Optimization (GRPO). By optimizing a composite reward for answer fidelity and provenance correctness, GenProve significantly outperforms 14 strong LLMs in joint evaluation. Crucially, our analysis uncovers a reasoning gap where models excel at surface-level quotation but struggle significantly with inference-based provenance, suggesting that verifiable reasoning remains a frontier challenge distinct from surface-level citation.
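
To make the abstract's moving parts concrete, here is a minimal Python sketch of a sentence-level provenance triple, a composite reward, and the group-relative advantage normalization commonly used in GRPO. It is an illustration under stated assumptions, not the paper's implementation. The abstract does not specify the triple fields, the provenance metric, or the reward weighting, so `ProvenanceTriple`, `provenance_f1`, `composite_reward`, the `weight` parameter, and the `"d1:s3"` span identifiers are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from statistics import mean, pstdev


class Relation(Enum):
    # The three relation types named in the abstract.
    QUOTATION = "quotation"
    COMPRESSION = "compression"
    INFERENCE = "inference"


@dataclass(frozen=True)
class ProvenanceTriple:
    # Hypothetical layout: the abstract only says triples are sentence-level.
    sentence_id: int    # index of the generated answer sentence
    source_span: str    # identifier of the supporting source sentence
    relation: Relation  # how the source supports the sentence


def provenance_f1(predicted, gold):
    """Exact-match F1 over triples (an assumed metric, not the paper's)."""
    pred, ref = set(predicted), set(gold)
    hits = len(pred & ref)
    if hits == 0:
        return 0.0
    precision = hits / len(pred)
    recall = hits / len(ref)
    return 2 * precision * recall / (precision + recall)


def composite_reward(answer_score, predicted, gold, weight=0.5):
    """Weighted sum of answer fidelity and provenance correctness.

    The linear form and the default weight are assumptions.
    """
    return weight * answer_score + (1 - weight) * provenance_f1(predicted, gold)


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its sampling group, as in GRPO.

    Uses the population standard deviation; implementations differ on this.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


gold = [
    ProvenanceTriple(0, "d1:s3", Relation.QUOTATION),
    ProvenanceTriple(1, "d1:s7", Relation.INFERENCE),
]
# The second triple cites the right span but mislabels the relation.
predicted = [
    ProvenanceTriple(0, "d1:s3", Relation.QUOTATION),
    ProvenanceTriple(1, "d1:s7", Relation.COMPRESSION),
]
reward = composite_reward(1.0, predicted, gold)
advantages = group_relative_advantages([reward, 0.25])
```

In this toy example a fully faithful answer still loses reward because one relation label is wrong, which is the kind of signal a joint objective would need in order to separate surface-level citation from inference-based provenance.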
