CL | Sep 17, 2024

Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Maojia Song, Shang Hong Sim, Rishabh Bhardwaj, Hai Leong Chieu, Navonil Majumder, Soujanya Poria

arXiv:2409.11242v4 | 12.932 citations | h-index: 77 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a gap in understanding how appropriate LLMs are for RAG tasks and offers an alignment method to enhance their trustworthiness. The advance appears incremental, since it builds on existing RAG frameworks.

The paper tackles the problem of evaluating and improving the trustworthiness of LLMs in retrieval-augmented generation (RAG) systems by introducing Trust-Score, a holistic metric, and Trust-Align, a method that aligns LLMs for better Trust-Score performance. Results show that 26 of 27 models aligned with Trust-Align substantially outperform competitive baselines on ASQA, QAMPARI, and ELI5; with LLaMA-3-8b, the gain over FRONT reaches 36.04 points on QAMPARI.

LLMs are an integral component of retrieval-augmented generation (RAG) systems. While many studies focus on evaluating the overall quality of end-to-end RAG systems, there is a gap in understanding the appropriateness of LLMs for the RAG task. To address this, we introduce Trust-Score, a holistic metric that evaluates the trustworthiness of LLMs within the RAG framework. Our results show that various prompting methods, such as in-context learning, fail to effectively adapt LLMs to the RAG task as measured by Trust-Score. Consequently, we propose Trust-Align, a method to align LLMs for improved Trust-Score performance. 26 out of 27 models aligned using Trust-Align substantially outperform competitive baselines on ASQA, QAMPARI, and ELI5. Specifically, in LLaMA-3-8b, Trust-Align outperforms FRONT on ASQA (up 12.56), QAMPARI (up 36.04), and ELI5 (up 17.69). Trust-Align also significantly enhances models' ability to correctly refuse and provide quality citations. We also demonstrate the effectiveness of Trust-Align across different open-weight models, including the LLaMA series (1b to 8b), Qwen-2.5 series (0.5b to 7b), and Phi3.5 (3.8b). We release our code at https://github.com/declare-lab/trust-align.
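
One way to picture the "ability to correctly refuse" mentioned above is as a binary decision: the model should decline when the retrieved documents do not support an answer, and answer otherwise. The sketch below scores that decision with precision, recall, and F1. It is an illustration under that assumption only, not the paper's Trust-Score definition; the `Example` fields and the `refusal_scores` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Example:
    # Hypothetical label: do the retrieved documents support an answer?
    answerable: bool
    # Did the model decline to answer?
    refused: bool


def refusal_scores(examples):
    """Return (precision, recall, F1) for the 'refuse' decision.

    A refusal counts as correct when the question is not answerable
    from the retrieved documents. This is an assumed framing, not the
    metric defined in the paper.
    """
    tp = sum(1 for e in examples if e.refused and not e.answerable)
    fp = sum(1 for e in examples if e.refused and e.answerable)
    fn = sum(1 for e in examples if not e.refused and not e.answerable)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    sample = [
        Example(answerable=False, refused=True),   # correct refusal
        Example(answerable=False, refused=False),  # should have refused
        Example(answerable=True, refused=True),    # over-refusal
        Example(answerable=True, refused=False),   # correct answer attempt
    ]
    print(refusal_scores(sample))
```

A model that refuses everything would reach perfect recall but poor precision under this framing, which is why the two are combined into F1 rather than reported as a single refusal rate.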
