CL · AI · LG · Feb 27, 2024

Self-Refinement of Language Models from External Proxy Metrics Feedback

arXiv:2403.00827v1 · 9 citations · h-index: 43 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses multi-objective response generation for LLMs in document-grounded tasks, offering an incremental improvement through iterative refinement.

The paper improves LLM responses by refining them along multiple quality dimensions, using feedback from external proxy metrics. It shows that self-refinement improves response quality on document-grounded QA datasets, and that fine-tuning on the resulting synthetic data yields significant gains over baselines.

It is often desirable for Large Language Models (LLMs) to capture multiple objectives when providing a response. In document-grounded response generation, for example, agent responses are expected to be relevant to a user's query while also being grounded in a given document. In this paper, we introduce Proxy Metric-based Self-Refinement (ProMiSe), which enables an LLM to refine its own initial response along key dimensions of quality, guided by feedback from external metrics, yielding a better final response overall. ProMiSe leverages feedback on response quality through principle-specific proxy metrics, and iteratively refines its response one principle at a time. We apply ProMiSe to the open-source language models Flan-T5-XXL and Llama-2-13B-Chat and evaluate its performance on the document-grounded question answering datasets MultiDoc2Dial and QuAC, demonstrating that self-refinement improves response quality. We further show that fine-tuning Llama-2-13B-Chat on the synthetic dialogue data generated by ProMiSe yields significant performance improvements over both the zero-shot baseline and a model fine-tuned with supervision on human-annotated data.
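The refinement loop described in the abstract (score a response with a principle-specific proxy metric, then refine one principle at a time) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the token-overlap metrics, the rule-based `toy_refine` stand-in for an LLM call, and the "accept only if the principle's score strictly improves" rule are all assumptions made for illustration.

```python
from typing import Callable, Dict

# Hypothetical signatures, not taken from the paper.
Metric = Callable[[str, str, str], float]        # (query, document, response) -> score
Refiner = Callable[[str, str, str, str], str]    # (query, document, response, principle) -> response


def tokens(text: str) -> list:
    return text.lower().split()


def groundedness(query: str, document: str, response: str) -> float:
    """Toy proxy metric: fraction of response tokens that occur in the document."""
    toks = tokens(response)
    if not toks:
        return 0.0
    doc = set(tokens(document))
    return sum(t in doc for t in toks) / len(toks)


def relevance(query: str, document: str, response: str) -> float:
    """Toy proxy metric: fraction of query tokens that occur in the response."""
    q = tokens(query)
    if not q:
        return 0.0
    resp = set(tokens(response))
    return sum(t in resp for t in q) / len(q)


def toy_refine(query: str, document: str, response: str, principle: str) -> str:
    """Rule-based stand-in for an LLM refinement call (assumption)."""
    doc = set(tokens(document))
    resp = tokens(response)
    if principle == "groundedness":
        # Drop tokens that the document does not support.
        return " ".join(t for t in resp if t in doc)
    if principle == "relevance":
        # Add query tokens that the document supports but the response lacks.
        missing = [t for t in tokens(query) if t in doc and t not in resp]
        return " ".join(resp + missing)
    return response


def refine_by_principles(
    query: str,
    document: str,
    initial_response: str,
    metrics: Dict[str, Metric],
    refine: Refiner,
    max_rounds: int = 2,
) -> str:
    """Refine one principle at a time, keeping a candidate only if the
    proxy metric for that principle strictly improves (assumed rule)."""
    response = initial_response
    for _ in range(max_rounds):
        improved = False
        for principle, metric in metrics.items():
            candidate = refine(query, document, response, principle)
            if metric(query, document, candidate) > metric(query, document, response):
                response = candidate
                improved = True
        if not improved:
            break  # no principle improved in this round
    return response


query = "when does the office open"
document = "the office opens at nine and closes at five"
metrics = {"groundedness": groundedness, "relevance": relevance}
final = refine_by_principles(query, document, "maybe the office opens early", metrics, toy_refine)
print(final)
```

In a real setting, `toy_refine` would be replaced by a prompted LLM call and the metrics by whatever proxy scorers are available. The loop structure is the only part this sketch is meant to convey.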

