CL · Jun 9, 2025

TreeReview: A Dynamic Tree of Questions Framework for Deep and Efficient LLM-based Scientific Peer Review

arXiv:2506.07642v3 · 25 citations · h-index: 26 · Has Code · EMNLP
Originality: Incremental advance
AI Analysis

The paper addresses automated scientific peer review, aiming to give researchers and reviewers more efficient and more insightful feedback. The advance is incremental: it builds on existing LLM-based review approaches.

To generate thorough reviews efficiently with LLMs, the paper proposes TreeReview, a framework that models review as a hierarchical question-answering process. The authors report improved review quality and up to 80% lower token usage compared to computationally intensive approaches.

While Large Language Models (LLMs) have shown significant potential in assisting peer review, current methods often struggle to generate thorough and insightful reviews while maintaining efficiency. In this paper, we propose TreeReview, a novel framework that models paper review as a hierarchical and bidirectional question-answering process. TreeReview first constructs a tree of review questions by recursively decomposing high-level questions into fine-grained sub-questions and then resolves the question tree by iteratively aggregating answers from leaf to root to get the final review. Crucially, we incorporate a dynamic question expansion mechanism to enable deeper probing by generating follow-up questions when needed. We construct a benchmark derived from ICLR and NeurIPS venues to evaluate our method on full review generation and actionable feedback comments generation tasks. Experimental results of both LLM-based and human evaluation show that TreeReview outperforms strong baselines in providing comprehensive, in-depth, and expert-aligned review feedback, while reducing LLM token usage by up to 80% compared to computationally intensive approaches. Our code and benchmark dataset are available at https://github.com/YuanChang98/tree-review.
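The two passes described in the abstract (top-down decomposition, then leaf-to-root aggregation with follow-up questions) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: every name here (`QuestionNode`, `build_tree`, `resolve`, and the `toy_*` helpers) is hypothetical, the LLM calls are replaced by deterministic stubs, and follow-up questions are simplified to a single round answered directly as leaves.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class QuestionNode:
    """One review question; children are its finer-grained sub-questions."""
    question: str
    depth: int = 0
    children: List["QuestionNode"] = field(default_factory=list)
    answer: Optional[str] = None


def build_tree(question: str,
               decompose: Callable[[str], List[str]],
               max_depth: int,
               depth: int = 0) -> QuestionNode:
    """Top-down pass: recursively split a question into sub-questions."""
    node = QuestionNode(question, depth)
    if depth < max_depth:
        for sub in decompose(question):
            node.children.append(build_tree(sub, decompose, max_depth, depth + 1))
    return node


def resolve(node: QuestionNode,
            answer_leaf: Callable[[str], str],
            aggregate: Callable[[str, List[str]], str],
            follow_ups: Callable[[str, List[str]], List[str]],
            max_depth: int) -> str:
    """Bottom-up pass: answer leaves, then aggregate answers toward the root."""
    if not node.children:
        node.answer = answer_leaf(node.question)
        return node.answer
    answers = [resolve(c, answer_leaf, aggregate, follow_ups, max_depth)
               for c in node.children]
    # Dynamic expansion (simplified assumption): one round of follow-up
    # questions, each answered directly as a new leaf under this node.
    if node.depth < max_depth:
        for q in follow_ups(node.question, answers):
            child = QuestionNode(q, node.depth + 1, answer=answer_leaf(q))
            node.children.append(child)
            answers.append(child.answer)
    node.answer = aggregate(node.question, answers)
    return node.answer


# Deterministic stand-ins for the LLM calls (purely illustrative).
ROOT = "Is the paper sound?"
TOY_PLAN = {ROOT: ["Are the claims supported?", "Are the baselines fair?"]}


def toy_decompose(q: str) -> List[str]:
    return TOY_PLAN.get(q, [])


def toy_answer(q: str) -> str:
    return f"A({q})"


def toy_aggregate(q: str, answers: List[str]) -> str:
    return f"{q} <- " + " | ".join(answers)


def toy_follow_ups(q: str, answers: List[str]) -> List[str]:
    # Ask one extra question at the root if fewer than three answers exist.
    if q == ROOT and len(answers) < 3:
        return ["Is the evaluation reproducible?"]
    return []


if __name__ == "__main__":
    tree = build_tree(ROOT, toy_decompose, max_depth=2)
    print(resolve(tree, toy_answer, toy_aggregate, toy_follow_ups, max_depth=2))
```

In a real system the four callables would wrap prompts to a model with access to the paper text; keeping them as parameters makes the control flow testable without any model.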

