CL AI · Nov 3, 2024

Rate, Explain and Cite (REC): Enhanced Explanation and Attribution in Automatic Evaluation by Large Language Models

Aliyah R. Hsu, James Zhu, Zhichao Wang, Bin Bi, Shubham Mehrotra, Shiva K. Pentyala, Katherine Tan, Xiang-Bo Mao, Roshanak Omrani, Sougata Chaudhuri, Regunathan Radhakrishnan, Sitaram Asur

arXiv:2411.02448v3 · 1.91 citations · h-index: 18 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the problem of ensuring text quality for users who rely on LLM-generated content, though the advance is incremental because it builds on existing autoevaluation methods.

The paper tackles the challenge of evaluating text generated by large language models by introducing fine-tuned autoevaluators (REC-8B, REC-12B, REC-70B) that rate metrics like faithfulness and coherence, provide explanations, and offer verifiable citations, with REC-70B outperforming state-of-the-art LLMs in benchmarks.

LLMs have demonstrated impressive proficiency in generating coherent and high-quality text, making them valuable across a range of text-generation tasks. However, rigorous evaluation of this generated content is crucial, as ensuring its quality remains a significant challenge due to persistent issues such as factual inaccuracies and hallucination. This paper introduces three fine-tuned general-purpose LLM autoevaluators, REC-8B, REC-12B and REC-70B, specifically designed to evaluate generated text across several dimensions: faithfulness, instruction following, coherence, and completeness. These models not only provide ratings for these metrics but also offer detailed explanation and verifiable citation, thereby enhancing trust in the content. Moreover, the models support various citation modes, accommodating different requirements for latency and granularity. Extensive evaluations on diverse benchmarks demonstrate that our general-purpose LLM auto-evaluator, REC-70B, outperforms state-of-the-art LLMs, excelling in content evaluation by delivering better quality explanation and citation with minimal bias. Our REC dataset and models are available at https://github.com/adelaidehsu/REC.
