Context-Efficient Retrieval with Factual Decomposition
This work addresses retrieval efficiency for LLM applications, but the contribution is incremental: it adds a specific pre-processing technique on top of existing retrieval methods.
The paper tackles inefficient retrieval for large language models by pre-processing the external corpus into atomic facts. This improves performance on question answering tasks when the amount of retrieved text is limited, and the smaller context in turn improves inference efficiency.
There has recently been considerable interest in incorporating information retrieval into large language models (LLMs). Retrieval from a dynamically expanding external corpus of text allows a model to incorporate current events and can be viewed as a form of episodic memory. Here we demonstrate that pre-processing the external corpus into semi-structured "atomic facts" makes retrieval more efficient. More specifically, we demonstrate that our particular form of atomic facts improves performance on various question answering tasks when the amount of retrieved text is limited. Limiting the amount of retrieval reduces the size of the context and improves inference efficiency.
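To make the idea of retrieval under a limited text budget concrete, here is a minimal sketch, not the paper's implementation. It assumes the corpus has already been decomposed into short atomic-fact strings, and it swaps in a simple word-overlap score where a real system would use a proper retriever. The function names (`tokenize`, `overlap_score`, `retrieve_within_budget`), the `max_tokens` parameter, and the example facts are all hypothetical and chosen only for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, fact):
    """Count distinct query words that also appear in the fact.

    Assumption: word overlap replaces whatever retriever a real system uses.
    """
    return len(set(tokenize(query)) & set(tokenize(fact)))


def retrieve_within_budget(query, facts, max_tokens):
    """Greedily select the best-scoring facts whose total length fits max_tokens."""
    ranked = sorted(facts, key=lambda f: overlap_score(query, f), reverse=True)
    selected, used = [], 0
    for fact in ranked:
        if overlap_score(query, fact) == 0:
            break  # remaining facts share no words with the query
        cost = len(tokenize(fact))
        if used + cost <= max_tokens:
            selected.append(fact)
            used += cost
    return selected


# Hypothetical atomic facts, invented for illustration only.
facts = [
    "Marie Curie was born in Warsaw.",
    "Marie Curie won the Nobel Prize in Physics in 1903.",
    "Pierre Curie was a French physicist.",
]

context = retrieve_within_budget("Where was Marie Curie born?", facts, max_tokens=8)
print(context)
```

Because each fact is short and self-contained, a small budget can still hold the one statement that answers the question, which is the intuition behind limiting retrieved text without losing accuracy.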