CL LG | May 27, 2023

Augmentation-Adapted Retriever Improves Generalization of Language Models as Generic Plug-In

Zichun Yu, Chenyan Xiong, Shi Yu, Zhiyuan Liu

arXiv:2305.17331v129.1261 citations | Has Code


AI Analysis

AAR acts as a generic retrieval plug-in for various LMs, addressing the need for adaptable augmentation in knowledge-intensive tasks without jointly fine-tuning the retriever and the LM.

The paper tackles retrieval augmentation for language models (LMs) without joint fine-tuning of the retriever and the LM. It proposes an augmentation-adapted retriever (AAR) that learns document preferences from a known source LM and uses them to improve the zero-shot generalization of unseen target LMs. Experiments on the MMLU and PopQA datasets show that AAR trained with a small source LM significantly improves larger target LMs ranging from 250M Flan-T5 to 175B InstructGPT.

Retrieval augmentation can aid language models (LMs) in knowledge-intensive tasks by supplying them with external information. Prior works on retrieval augmentation usually jointly fine-tune the retriever and the LM, making them closely coupled. In this paper, we explore the scheme of generic retrieval plug-in: the retriever is to assist target LMs that may not be known beforehand or are unable to be fine-tuned together. To retrieve useful documents for unseen target LMs, we propose augmentation-adapted retriever (AAR), which learns LM's preferences obtained from a known source LM. Experiments on the MMLU and PopQA datasets demonstrate that our AAR trained with a small source LM is able to significantly improve the zero-shot generalization of larger target LMs ranging from 250M Flan-T5 to 175B InstructGPT. Further analysis indicates that the preferences of different LMs overlap, enabling AAR trained with a single source LM to serve as a generic plug-in for various target LMs. Our code is open-sourced at https://github.com/OpenMatch/Augmentation-Adapted-Retriever.
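The idea of learning a source LM's document preferences can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how preferences are measured, so the `score` callable below is a hypothetical stand-in for whatever signal the source LM provides, and the function names and prompt layout are assumptions made for illustration only. The sketch shows two steps: splitting retrieved documents into preferred and non-preferred sets (which could then serve as training pairs for a retriever), and prepending retrieved documents to a question for a target LM that is never fine-tuned.

```python
from typing import Callable, Dict, List

# Hypothetical signature: maps (question, document) to a preference score.
# In practice this would come from the source LM; here it is an assumption.
Scorer = Callable[[str, str], float]


def split_by_preference(
    question: str, docs: List[str], score: Scorer, top_k: int = 1
) -> Dict[str, List[str]]:
    """Rank documents by the source-LM score; the top_k become positives."""
    ranked = sorted(docs, key=lambda d: score(question, d), reverse=True)
    return {"positives": ranked[:top_k], "negatives": ranked[top_k:]}


def augment_prompt(question: str, docs: List[str]) -> str:
    """Prepend retrieved documents to the question (assumed prompt layout)."""
    context = "\n".join(docs)
    return f"{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # Made-up scores standing in for a source LM's preferences.
    fake_scores = {"doc A": 0.1, "doc B": 0.9, "doc C": 0.5}
    pairs = split_by_preference(
        "example question",
        list(fake_scores),
        lambda q, d: fake_scores[d],
    )
    print(pairs)
    print(augment_prompt("example question", pairs["positives"]))
```

Passing the scorer as a parameter keeps the sketch independent of any particular source LM, which mirrors the plug-in framing: the target LM only ever sees the augmented prompt.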
