CL | Apr 8

DTCRS: Dynamic Tree Construction for Recursive Summarization

arXiv:2604.07012 | 89.55 citations
AI Analysis

This work addresses efficiency and relevance issues in retrieval-augmented generation for question answering, though it appears incremental as it builds on existing recursive summarization techniques.

The paper targets redundant summary nodes in recursive summarization for question answering. It introduces DTCRS, a method that dynamically generates summary trees based on document structure and query semantics; the authors report significantly reduced construction time and improved performance across three QA tasks.

Retrieval-Augmented Generation (RAG) mitigates the hallucination problem of Large Language Models (LLMs) by incorporating external knowledge. Recursive summarization constructs a hierarchical summary tree by clustering text chunks, integrating information from multiple parts of a document to provide evidence for abstractive questions involving multi-step reasoning. However, summary trees often contain a large number of redundant summary nodes, which not only increase construction time but may also negatively impact question answering. Moreover, recursive summarization is not suitable for all types of questions. We introduce DTCRS, a method that dynamically generates summary trees based on document structure and query semantics. DTCRS determines whether a summary tree is necessary by analyzing the question type. It then decomposes the question and uses the embeddings of sub-questions as initial cluster centers, reducing redundant summaries while improving the relevance between summaries and the question. Our approach significantly reduces summary tree construction time and achieves substantial improvements across three QA tasks. Additionally, we investigate the applicability of recursive summarization to different question types, providing valuable insights for future research.
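
The clustering step described in the abstract, where sub-question embeddings serve as initial cluster centers, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state the clustering algorithm, the similarity measure, or the embedding model, so this uses a k-means-style loop with cosine similarity and hand-written 2-D toy vectors in place of real embeddings. The function name `seeded_clusters` is hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def seeded_clusters(chunk_vecs, subq_vecs, iters=10):
    """Cluster chunk embeddings, starting from sub-question embeddings as centers.

    Assumption: a k-means-style loop with cosine similarity; the paper's
    actual procedure may differ. Returns one cluster index per chunk.
    """
    centers = [list(v) for v in subq_vecs]  # one initial center per sub-question
    assign = []
    for _ in range(iters):
        # Assign every chunk to its most similar center.
        new_assign = [
            max(range(len(centers)), key=lambda k: cosine(v, centers[k]))
            for v in chunk_vecs
        ]
        if new_assign == assign:  # converged: assignments stopped changing
            break
        assign = new_assign
        # Move each center to the mean of its members; empty clusters keep their seed.
        for k in range(len(centers)):
            members = [v for v, a in zip(chunk_vecs, assign) if a == k]
            if members:
                centers[k] = mean(members)
    return assign


# Toy stand-ins for embeddings (hypothetical values, not real model output).
chunks = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]]
sub_questions = [[1.0, 0.0], [0.0, 1.0]]
print(seeded_clusters(chunks, sub_questions))
```

Seeding the centers this way fixes the number of clusters at the number of sub-questions and starts every cluster near something the query asks about, which is one plausible reading of how the method avoids summaries unrelated to the question.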

