Aspect-Guided Multi-Level Perturbation Analysis of Large Language Models in Automated Peer Review
This work addresses vulnerabilities in automated peer review systems, a concern central to fairness and reliability in academic publishing. Its contribution is incremental, since it builds on existing evaluation methods.
The paper evaluates the robustness of Large Language Models (LLMs) in automated peer review by proposing an aspect-guided, multi-level perturbation framework. It finds that targeted perturbations to papers, reviews, and rebuttals introduce significant biases (for example, incomplete rebuttals lead to higher acceptance rates) and that these biases persist under various prompting strategies.
We propose an aspect-guided, multi-level perturbation framework to evaluate the robustness of Large Language Models (LLMs) in automated peer review. Our framework explores perturbations in three key components of the peer review process (papers, reviews, and rebuttals) across several quality aspects, including contribution, soundness, presentation, tone, and completeness. By applying targeted perturbations and examining their effects on both LLM-as-Reviewer and LLM-as-Meta-Reviewer, we investigate how aspect-based manipulations, such as omitting methodological details from papers or altering reviewer conclusions, can introduce significant biases in the review process. We identify several potential vulnerabilities: review conclusions that recommend a strong reject may significantly influence meta-reviews, negative or misleading reviews may be wrongly interpreted as thorough, and incomplete or hostile rebuttals can unexpectedly lead to higher acceptance rates. Statistical tests show that these biases persist under various Chain-of-Thought prompting strategies, highlighting the lack of robust critical evaluation in current LLMs. Our framework offers a practical methodology for diagnosing these vulnerabilities, thereby contributing to the development of more reliable and robust automated reviewing systems.
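To make the shape of such an analysis concrete, the following is a minimal sketch, not the authors' implementation. It enumerates component and aspect pairs and then compares paired accept/reject decisions before and after a perturbation. All names (`Perturbation`, `perturbation_grid`, `acceptance_shift`, `exact_mcnemar_p`) are hypothetical. The abstract does not say which statistical tests are used or which aspects apply to which components, so the full cross product and the exact McNemar-style sign test here are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import comb

# Components and aspects named in the abstract.
COMPONENTS = ("paper", "review", "rebuttal")
ASPECTS = ("contribution", "soundness", "presentation", "tone", "completeness")


@dataclass(frozen=True)
class Perturbation:
    """Hypothetical record: which component is perturbed along which aspect."""
    component: str
    aspect: str


def perturbation_grid():
    """Full component x aspect cross product (assumption: the actual
    framework may apply only a subset of aspects to each component)."""
    return [Perturbation(c, a) for c, a in product(COMPONENTS, ASPECTS)]


def acceptance_shift(original, perturbed):
    """Change in acceptance rate (perturbed minus original) over paired decisions."""
    n = len(original)
    return sum(perturbed) / n - sum(original) / n


def exact_mcnemar_p(original, perturbed):
    """Two-sided exact test on discordant pairs (assumed test, see lead-in).

    Under the null hypothesis, a decision flip is equally likely to go
    accept -> reject as reject -> accept.
    """
    b = sum(1 for o, p in zip(original, perturbed) if o and not p)  # accept -> reject
    c = sum(1 for o, p in zip(original, perturbed) if not o and p)  # reject -> accept
    n = b + c
    if n == 0:
        return 1.0  # no decision changed, so there is no evidence of bias
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Toy data, invented purely for illustration: True means "accept".
    before = [False] * 10
    after = [True] * 8 + [False] * 2
    print(len(perturbation_grid()), "component/aspect pairs")
    print("acceptance shift:", acceptance_shift(before, after))
    print("exact p-value:", exact_mcnemar_p(before, after))
```

In practice, the `before` and `after` lists would hold the decisions an LLM reviewer or meta-reviewer returns for the same submissions with and without a given perturbation. A paired test is used in this sketch because each submission serves as its own control.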