CL | Mar 22, 2024

Awakening Augmented Generation: Learning to Awaken Internal Knowledge of Large Language Models for Question Answering

Huanxuan Liao, Shizhu He, Yao Xu, Yuanzhe Zhang, Kang Liu, Shengping Liu, Jun Zhao

arXiv:2403.15268v5 | 13.220 citations | h-index: 30 | Has Code | COLING

Originality: Incremental advance

AI Analysis

This paper addresses inefficient knowledge activation in LLMs for question answering. Its approach reduces reliance on external data and lowers inference cost, though it appears incremental because it builds on existing augmentation methods.

The paper aims to improve question answering with large language models by proposing Awakening-Augmented-Generation (AAG), a framework that awakens the model's internal knowledge without external resources. It reports significant advantages in open-domain and closed-book settings, and in out-of-distribution generalization, across three datasets.

Retrieval-Augmented-Generation and Generation-Augmented-Generation have been proposed to enhance the knowledge required for question answering with Large Language Models (LLMs) by leveraging richer context. However, the former relies on external resources, and both require incorporating explicit documents into the context, which increases execution costs and susceptibility to noisy data during inference. Recent works indicate that LLMs model rich knowledge, but that this knowledge is often not effectively activated and awakened. Inspired by this, we propose a novel knowledge-augmented framework, Awakening-Augmented-Generation (AAG), which mimics the human ability to answer questions using only thinking and recalling to compensate for knowledge gaps, thereby awakening relevant knowledge in LLMs without relying on external resources. AAG consists of two key components for awakening richer context. Explicit awakening fine-tunes a context generator to create a synthetic, compressed document that functions as symbolic context. Implicit awakening utilizes a hypernetwork to generate adapters based on the question and synthetic document, which are inserted into LLMs to serve as parameter context. Experimental results on three datasets demonstrate that AAG exhibits significant advantages in both open-domain and closed-book settings, as well as in out-of-distribution generalization. Our code will be available at https://github.com/Xnhyacinth/IAG.
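
The implicit awakening step described in the abstract, a hypernetwork that produces adapter weights from the question and synthetic document, can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the adapter form, so the low-rank bottleneck adapter with a ReLU, the class name `ToyHypernetwork`, the function `apply_adapter`, and the fixed-size `context` vector (standing in for an encoding of the question plus synthetic document) are all assumptions made for illustration.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a trained hypernetwork would learn these."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class ToyHypernetwork:
    """Hypothetical hypernetwork: maps a context vector to adapter weights."""

    def __init__(self, context_dim, hidden_dim, rank, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.rank = rank
        # One linear head per adapter matrix; each head emits flattened weights.
        self.down_head = random_matrix(rank * hidden_dim, context_dim, rng)
        self.up_head = random_matrix(hidden_dim * rank, context_dim, rng)

    def generate_adapter(self, context):
        """Return (down, up) matrices conditioned on the context vector."""
        flat_down = matvec(self.down_head, context)
        flat_up = matvec(self.up_head, context)
        h, r = self.hidden_dim, self.rank
        down = [flat_down[i * h:(i + 1) * h] for i in range(r)]  # rank x hidden
        up = [flat_up[i * r:(i + 1) * r] for i in range(h)]      # hidden x rank
        return down, up


def apply_adapter(hidden, adapter):
    """Assumed adapter form: hidden + up(relu(down(hidden)))."""
    down, up = adapter
    bottleneck = [max(0.0, v) for v in matvec(down, hidden)]
    delta = matvec(up, bottleneck)
    return [h + d for h, d in zip(hidden, delta)]


# Usage: the same hidden state is adapted differently per question context.
hyper = ToyHypernetwork(context_dim=4, hidden_dim=6, rank=2)
hidden = [1.0, 0.0, -1.0, 0.5, 0.2, -0.4]
context = [0.5, -0.2, 0.1, 0.9]  # placeholder for an encoded question + document
adapted = apply_adapter(hidden, hyper.generate_adapter(context))
```

The point of the design is that the frozen model's weights stay fixed while the adapter weights change per input, which is one way to read the abstract's term "parameter context". In this sketch an all-zero context yields all-zero adapter matrices, so the hidden state passes through unchanged.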

