CL · Jun 18, 2024

LightPAL: Lightweight Passage Retrieval for Open Domain Multi-Document Summarization

arXiv:2406.12494v21 citations
Originality: Incremental advance
AI Analysis

This paper addresses the high latency of iterative retrieval for open-ended summarization queries, offering a more practical option for users who need efficient information synthesis.

The paper tackles inefficient retrieval in open-domain multi-document summarization by proposing LightPAL, a lightweight method that pre-constructs a passage graph and applies a random walk at retrieval time, avoiding iterative LLM inference. It reports improved retrieval and summarization metrics together with higher efficiency.

Open-Domain Multi-Document Summarization (ODMDS) is the task of generating summaries from large document collections in response to user queries. This task is crucial for efficiently addressing diverse information needs from users. Traditional retrieve-then-summarize approaches fall short for open-ended queries in ODMDS tasks. These queries often require broader context than the initially retrieved passages provide, making it challenging to retrieve all relevant information in a single search. While iterative retrieval methods have been explored for multi-hop question answering (MQA), they are impractical for ODMDS due to the high latency of repeated LLM inference. Accordingly, we propose LightPAL, a lightweight passage retrieval method for ODMDS. LightPAL leverages an LLM to pre-construct a graph representing passage relationships, then employs a random walk during retrieval, avoiding iterative LLM inference. Experiments demonstrate that LightPAL outperforms naive sparse and pre-trained dense retrievers in both retrieval and summarization metrics, while achieving higher efficiency than iterative MQA approaches.
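The graph-plus-random-walk idea in the abstract can be illustrated with a small sketch. The abstract does not specify the walk variant, the edge semantics, or any parameters, so everything below is an assumption for illustration: the sketch uses a random walk with restart, seeded by hypothetical first-stage retrieval scores, over a hypothetical passage graph that would have been built offline (for example by an LLM linking related passages). The names `random_walk_scores`, `retrieve`, `edges`, `seeds`, and `restart` are invented here and are not taken from the paper.

```python
def random_walk_scores(edges, seeds, restart=0.15, iterations=50):
    """Random walk with restart over a directed passage graph.

    edges: dict mapping a passage id to the passage ids it links to
           (assumed to be pre-built offline; hypothetical structure).
    seeds: dict mapping a passage id to a non-negative first-stage
           retrieval score (hypothetical input).
    Returns a dict mapping every passage id to its walk score.
    """
    nodes = set(edges) | set(seeds)
    for targets in edges.values():
        nodes.update(targets)

    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seeds must contain at least one positive score")

    # The walker restarts at seed passages in proportion to their scores.
    restart_dist = {n: seeds.get(n, 0.0) / total for n in nodes}
    scores = dict(restart_dist)

    for _ in range(iterations):
        nxt = {n: restart * restart_dist[n] for n in nodes}
        for node in nodes:
            mass = (1.0 - restart) * scores[node]
            targets = edges.get(node, [])
            if targets:
                # Spread the remaining mass evenly over outgoing links.
                share = mass / len(targets)
                for target in targets:
                    nxt[target] += share
            else:
                # Dangling passage: send its mass back to the seeds.
                for n in nodes:
                    nxt[n] += mass * restart_dist[n]
        scores = nxt
    return scores


def retrieve(edges, seeds, k=3):
    """Return the k passage ids with the highest walk scores."""
    scores = random_walk_scores(edges, seeds)
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]
```

In this sketch the only query-time cost is a fixed number of passes over the graph, with no LLM calls, which is the efficiency argument the abstract makes against iterative retrieval. Passages that the first-stage retriever missed can still receive a non-zero score if they are reachable from a seed passage through the pre-built links.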

