CL · IR · Jun 2

Re-Ranking Through an Attribution Lens for Citation Quality in Legal QA

arXiv:2606.03728 · 81.0 · h-index: 30

Predicted impact: top 66% in CL · last 90 days · Originality: Incremental advance

AI Analysis

This work targets retrieval for legal QA systems: surfacing the passages that a language model actually cites, with the aim of improving citation faithfulness.

The authors show that semantic similarity ranking does not correlate with citation quality in legal QA, and train a lightweight cross-encoder on perturbation-based attribution scores to re-rank passages. The re-ranker substantially improves citation faithfulness and alignment with gold expert answers on the AQuAECHR benchmark.

Retrieval-augmented generation systems for legal question answering typically retrieve passages based on semantic similarity and provide them to a language model, which then generates cited answers. Prior work assumes that highly ranked passages are the ones most likely to be usefully cited by the model. Perturbation-based attribution methods, such as C-LIME, have been used exclusively for post-hoc explanation. However, on the AQuAECHR benchmark, semantic similarity does not correlate with passage attribution. Within a retriever's candidate pool, similarity-based ranking performs worse than random selection at surfacing gold citation paragraphs. To address this limitation, a lightweight cross-encoder is trained on continuous perturbation-based attribution scores to re-rank passages prior to generation. This approach is evaluated on the AQuAECHR benchmark, using two language models and five-fold cross-validation. The re-ranker substantially improves citation faithfulness and alignment with gold expert answers. Notably, two re-rankers trained independently on attribution scores from different models agree with each other more closely than the raw attribution scores of those models do. This finding indicates that the cross-encoder reduces model-specific noise and produces a shared relevance signal that partially transfers across models, although same-model re-ranking remains more effective. These results demonstrate that perturbation-based attribution provides a practical, model-agnostic training signal for citation-aware retrieval.
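To make the idea of a perturbation-based attribution score concrete, here is a minimal sketch. It is not the paper's method: it swaps C-LIME for plain leave-one-out ablation, and `overlap_score` is a hypothetical token-overlap stand-in for a language model's answer score. All function names and the example passages are invented for illustration. In the setup the abstract describes, scores of this kind would serve as continuous training targets for a cross-encoder; here the passages are simply sorted by them.

```python
from typing import Callable, List, Sequence


def overlap_score(answer: str, passages: Sequence[str]) -> float:
    """Hypothetical stand-in for an LM answer score: the fraction of
    answer tokens that also occur in the provided passages."""
    answer_tokens = answer.lower().split()
    context_tokens = set(" ".join(passages).lower().split())
    if not answer_tokens:
        return 0.0
    return sum(t in context_tokens for t in answer_tokens) / len(answer_tokens)


def leave_one_out_attribution(
    answer: str,
    passages: Sequence[str],
    score_fn: Callable[[str, Sequence[str]], float] = overlap_score,
) -> List[float]:
    """Attribution of passage i = score with all passages
    minus score with passage i removed (a simple perturbation)."""
    full = score_fn(answer, passages)
    return [
        full - score_fn(answer, [p for j, p in enumerate(passages) if j != i])
        for i in range(len(passages))
    ]


def rerank(passages: Sequence[str], scores: Sequence[float]) -> List[str]:
    """Order passages by descending attribution score (ties keep input order)."""
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [passages[i] for i in order]


# Invented example data, in the retriever's original (similarity) order.
passages = [
    "The applicant complained about the length of proceedings",
    "Article 6 guarantees a fair hearing within a reasonable time",
    "The Court sits in Strasbourg",
]
answer = "Article 6 guarantees a hearing within a reasonable time"

scores = leave_one_out_attribution(answer, passages)
reranked = rerank(passages, scores)
```

With this toy scorer, removing the second passage is the only perturbation that lowers the answer score, so it receives the highest attribution and moves to the top of the re-ranked list regardless of where the retriever placed it.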
