AttentionRetriever: Attention Layers are Secretly Long Document Retrievers
This work addresses the challenge of retrieving relevant information from long documents for LLMs, which is crucial for tasks such as RAG. The contribution appears incremental, since it builds on existing attention and retrieval methods.
The paper tackles long document retrieval for retrieval augmented generation by proposing AttentionRetriever, which uses the attention mechanism and entity-based retrieval to build context-aware embeddings and determine the scope of retrieval. In the reported experiments, it outperforms existing models on long document datasets by a large margin while remaining as efficient as dense retrieval models.
Retrieval augmented generation (RAG) has been widely adopted to help Large Language Models (LLMs) process tasks involving long documents. However, existing retrieval models are not designed for long document retrieval and fail to address several of its key challenges, including context-awareness, causal dependence, and scope of retrieval. In this paper, we propose AttentionRetriever, a novel long document retrieval model that leverages the attention mechanism and entity-based retrieval to build context-aware embeddings for long documents and determine the scope of retrieval. With extensive experiments, we find that AttentionRetriever outperforms existing retrieval models on long document retrieval datasets by a large margin while remaining as efficient as dense retrieval models.
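To make the idea of attention as a retrieval signal concrete, here is a minimal sketch. It is not the paper's algorithm, whose details are not given in the abstract. It assumes a query-to-document attention matrix has already been extracted from some model layer and that the document has been split into chunks by token span. The names `score_chunks`, `top_k_chunks`, `attn`, and `spans` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def score_chunks(attn: List[List[float]], spans: List[Tuple[int, int]]) -> List[float]:
    """Average attention each chunk receives from the query tokens.

    attn[i][j]: attention weight from query token i to document token j
                (assumed to be precomputed; not the paper's exact input).
    spans:      half-open (start, end) token ranges, one per chunk (assumed).
    """
    n_query = len(attn)
    scores = []
    for start, end in spans:
        # Total attention mass landing on this chunk, over all query tokens.
        mass = sum(sum(row[start:end]) for row in attn)
        # Normalize by query count and chunk length so long chunks are not favored.
        scores.append(mass / (n_query * (end - start)))
    return scores


def top_k_chunks(attn: List[List[float]], spans: List[Tuple[int, int]], k: int) -> List[int]:
    """Return indices of the k highest-scoring chunks, best first."""
    scores = score_chunks(attn, spans)
    ranked = sorted(range(len(spans)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy example: 2 query tokens, 6 document tokens, 3 chunks of 2 tokens each.
attn = [
    [0.05, 0.05, 0.40, 0.40, 0.05, 0.05],
    [0.10, 0.10, 0.30, 0.30, 0.10, 0.10],
]
spans = [(0, 2), (2, 4), (4, 6)]
print(top_k_chunks(attn, spans, k=1))  # -> [1]
```

In this toy setup the middle chunk receives most of the attention mass and is ranked first. A real system would also need to choose which layers and heads to read attention from and how to decide the retrieval scope, which the abstract attributes to entity-based retrieval.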