AI MA · Dec 23, 2025

MAR: Multi-Agent Reflexion Improves Reasoning Abilities in LLMs

Onat Ozer, Grace Wu, Yuchen Wang, Daniel Dosti, Honghao Zhang, Vivi De La Rue

arXiv:2512.20845v1 · 7 citations

Originality Incremental advance

AI Analysis

This paper addresses degeneration of thought in LLMs on reasoning tasks, offering an incremental improvement over existing reflection methods.

The paper tackles the problem of LLMs repeating errors in reasoning tasks by introducing multi-agent debaters to generate diverse reflections, achieving 47% EM on HotpotQA and 82.7% on HumanEval, outperforming single-LLM reflection.

LLMs have shown the capacity to improve their performance on reasoning tasks by reflecting on their mistakes and acting with these reflections in mind. However, continual reflection by the same LLM on itself exhibits degeneration of thought, where the LLM keeps repeating the same errors even when it knows they are wrong. To address this problem, we instead introduce a multi-agent setup with multi-persona debaters as the method for generating reflections. Through our extensive experimentation, we have found that this leads to greater diversity in the reflections generated by the LLM agent. We demonstrate an accuracy of 47% EM on HotpotQA (question answering) and 82.7% on HumanEval (programming), both surpassing reflection with a single LLM.
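The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the persona texts, prompt wording, function names, and the `llm` callable (any function mapping a prompt string to a completion string) are all assumptions made for illustration, and the sketch omits any judging or aggregation step the paper may use.

```python
from typing import Callable, List

# Hypothetical personas; the personas used in the paper are not given here.
PERSONAS = [
    "a skeptical verifier who checks each factual claim",
    "a careful logician who looks for invalid reasoning steps",
    "a pragmatic engineer who looks for the simplest fix",
]

# Assumed interface: a function from prompt text to completion text.
LLM = Callable[[str], str]


def multi_persona_reflections(
    llm: LLM, task: str, failed_attempt: str, personas: List[str] = PERSONAS
) -> List[str]:
    """Collect one critique per persona; each persona sees earlier critiques."""
    reflections: List[str] = []
    for persona in personas:
        prompt = (
            f"You are {persona}.\n"
            f"Task: {task}\n"
            f"Failed attempt: {failed_attempt}\n"
            "Other critiques so far:\n" + "\n".join(reflections) + "\n"
            "Explain what went wrong and what to do differently."
        )
        reflections.append(llm(prompt))
    return reflections


def solve_with_reflection(
    llm: LLM, task: str, is_correct: Callable[[str], bool], max_trials: int = 3
) -> str:
    """Retry the task, feeding reflections from failed trials into later ones."""
    memory: List[str] = []
    attempt = ""
    for _ in range(max_trials):
        prompt = (
            f"Task: {task}\n"
            "Reflections from earlier trials:\n" + "\n".join(memory) + "\n"
            "Answer:"
        )
        attempt = llm(prompt)
        if is_correct(attempt):
            return attempt
        # Several personas critique the failure instead of one self-reflection.
        memory.extend(multi_persona_reflections(llm, task, attempt))
    return attempt
```

The only difference from a single-LLM reflection loop in this sketch is the call to `multi_persona_reflections`, which replaces one self-critique with several critiques written from different viewpoints. `is_correct` stands in for whatever feedback signal the task provides, such as unit tests on HumanEval.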
