CL · Mar 6, 2025

SurveyForge: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing

arXiv:2503.04629v1 · 37 citations · h-index: 26 · Has Code · ACL
Originality: Incremental advance
AI Analysis

This paper addresses the problem of inefficient, low-quality automated survey writing for researchers, though its contribution appears incremental because it builds on existing LLM-based approaches.

The paper tackles the quality gap in LLM-generated survey papers by introducing SurveyForge, which improves outline generation by analyzing the logical structure of human-written outlines and improves citation accuracy through memory-driven content refinement. In the reported experiments it outperforms prior methods such as AutoSurvey.

Survey papers play a crucial role in scientific research, especially given the rapid growth of research publications. Recently, researchers have begun using LLMs to automate survey generation for better efficiency. However, the quality gap between LLM-generated surveys and those written by humans remains significant, particularly in terms of outline quality and citation accuracy. To close these gaps, we introduce SurveyForge, which first generates the outline by analyzing the logical structure of human-written outlines and referring to retrieved domain-related articles. Subsequently, leveraging high-quality papers retrieved from memory by our scholar navigation agent, SurveyForge automatically generates and refines the content of the article. Moreover, to achieve a comprehensive evaluation, we construct SurveyBench, which includes 100 human-written survey papers for win-rate comparison and assesses AI-generated survey papers across three dimensions: reference, outline, and content quality. Experiments demonstrate that SurveyForge can outperform previous works such as AutoSurvey.

Code Implementations: 1 repo