AI, CL | Nov 12, 2025

Chain of Summaries: Summarization Through Iterative Questioning

arXiv:2511.15719v1 | h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses LLM-unfriendly web formats, a problem for both website maintainers and LLM users, and offers an incremental improvement in summarization techniques.

The paper tackles the problem of making web content more digestible for Large Language Models (LLMs) by proposing Chain of Summaries (CoS), a method that iteratively refines summaries through questioning, resulting in up to 66% improvement over zero-shot LLM baselines and up to 27% over specialized summarization methods on datasets like TriviaQA.

Large Language Models (LLMs) are increasingly using external web content. However, much of this content is not easily digestible by LLMs due to LLM-unfriendly formats and limitations of context length. To address this issue, we propose a method for generating general-purpose, information-dense summaries that act as plain-text repositories of web content. Inspired by Hegel's dialectical method, our approach, denoted as Chain of Summaries (CoS), iteratively refines an initial summary (thesis) by identifying its limitations through questioning (antithesis), leading to a general-purpose summary (synthesis) that can satisfy current and anticipate future information needs. Experiments on the TriviaQA, TruthfulQA, and SQuAD datasets demonstrate that CoS outperforms zero-shot LLM baselines by up to 66% and specialized summarization methods such as BRIO and PEGASUS by up to 27%. CoS-generated summaries yield higher Q&A performance than the source content, while requiring substantially fewer tokens and being agnostic to the specific downstream LLM. CoS thus represents an appealing option for website maintainers to make their content more accessible for LLMs, while retaining possibilities for human oversight.
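
The thesis, antithesis, synthesis loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify prompts, stopping criteria, or how questions are generated, so the function names (`chain_of_summaries`, `summarize`, `ask`, `can_answer`, `refine`), the `max_rounds` parameter, and the keyword-based toy stand-ins are all assumptions made for illustration. In practice each callable would presumably wrap an LLM call.

```python
from typing import Callable, List


def chain_of_summaries(
    source: str,
    summarize: Callable[[str], str],
    ask: Callable[[str], List[str]],
    can_answer: Callable[[str, str], bool],
    refine: Callable[[str, str, List[str]], str],
    max_rounds: int = 3,  # hypothetical budget; not taken from the paper
) -> str:
    """Iteratively refine a summary by probing it with questions."""
    summary = summarize(source)  # thesis: initial summary
    for _ in range(max_rounds):
        questions = ask(source)
        # antithesis: questions the current summary cannot answer
        unanswered = [q for q in questions if not can_answer(summary, q)]
        if not unanswered:
            break
        # synthesis: fold the missing information back into the summary
        summary = refine(source, summary, unanswered)
    return summary


# Toy stand-ins (assumptions) so the loop runs without an LLM.
def _sentences(text: str) -> List[str]:
    return [s.strip() for s in text.split(".") if s.strip()]


def toy_summarize(source: str) -> str:
    # Keep only the first sentence as the initial summary.
    return _sentences(source)[0] + "."


def toy_ask(source: str) -> List[str]:
    # One "question" per sentence: its longest word, used as a keyword probe.
    return [max(s.split(), key=len) for s in _sentences(source)]


def toy_can_answer(summary: str, question: str) -> bool:
    # A probe counts as answered if its keyword appears in the summary.
    return question.lower() in summary.lower()


def toy_refine(source: str, summary: str, unanswered: List[str]) -> str:
    # Append every source sentence that mentions an unanswered keyword.
    for s in _sentences(source):
        if s not in summary and any(q in s for q in unanswered):
            summary += " " + s + "."
    return summary
```

With the toy stand-ins, a call such as `chain_of_summaries(text, toy_summarize, toy_ask, toy_can_answer, toy_refine)` starts from the first sentence of `text` and grows the summary until every keyword probe is covered or the round budget is exhausted. Passing the four steps as callables keeps the loop independent of any particular model, which mirrors the abstract's claim that the summaries are agnostic to the downstream LLM.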
