CL, Jun 19, 2024

ALiiCE: Evaluating Positional Fine-grained Citation Generation

Yilong Xu, Jinhua Gao, Xiaoming Yu, Baolong Bi, Huawei Shen, Xueqi Cheng

arXiv:2406.13375v3 12.916 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for better evaluation of citation generation in LLMs to enhance credibility, though it is incremental as it builds on existing citation generation research.

The authors address the evaluation of positional fine-grained citation generation in large language models, where citations can appear anywhere within a sentence rather than only at the sentence level. They propose ALiiCE, an automatic evaluation framework that uses dependency trees and three metrics, and demonstrate its effectiveness on long-form QA datasets.

Large language models (LLMs) can enhance their credibility and verifiability by generating text with citations. However, existing research on citation generation is predominantly limited to sentence-level statements, neglecting the significance of positional fine-grained citations, which can appear anywhere within a sentence. To facilitate further exploration of positional fine-grained citation generation, we propose ALiiCE, the first automatic evaluation framework for this task. Our method employs a dependency-tree-based approach to parse each sentence-level claim into atomic claims. ALiiCE then evaluates citation quality using three metrics: positional fine-grained citation recall, positional fine-grained citation precision, and the coefficient of variation of citation positions. We evaluate the positional fine-grained citation generation performance of several LLMs on long-form QA datasets. Our experiments and analyses demonstrate the effectiveness and reasonableness of ALiiCE. We also offer our insights into current advancements and future directions for the positional fine-grained citation generation task.
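To make the third metric concrete, here is a minimal sketch of a coefficient of variation over citation positions. The abstract does not give ALiiCE's exact formula, so this uses the standard definition (population standard deviation divided by the mean). The bracketed marker format `[n]`, the character-offset normalization, and the helper names `citation_positions` and `coefficient_of_variation` are all assumptions made for illustration, not the authors' implementation.

```python
import re
from statistics import mean, pstdev

# Assumption: citations are written as bracketed integers such as [1] or [12].
CITATION = re.compile(r"\[\d+\]")


def citation_positions(sentence: str) -> list[float]:
    """Relative position of each citation marker, as a character offset in [0, 1)."""
    if not sentence:
        return []
    return [m.start() / len(sentence) for m in CITATION.finditer(sentence)]


def coefficient_of_variation(values: list[float]) -> float:
    """Standard CV: population standard deviation divided by the mean.

    Returns 0.0 when there are fewer than two values or the mean is zero,
    a convention chosen here to avoid division by zero.
    """
    if len(values) < 2:
        return 0.0
    mu = mean(values)
    if mu == 0:
        return 0.0
    return pstdev(values) / mu


if __name__ == "__main__":
    text = "Aspirin reduces fever [1] and, in low doses, lowers clotting risk [2]."
    positions = citation_positions(text)
    print(positions, coefficient_of_variation(positions))
```

Under this definition, a higher value means citation markers are spread across different places in the text, while a value near zero means they cluster at one relative position, for example always at the end of the sentence.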
