AI | Oct 7, 2025

Belief-Calibrated Multi-Agent Consensus Seeking for Complex NLP Tasks

arXiv:2510.06307v1 | h-index: 39 | Has Code
Originality: Highly original
AI Analysis

This work addresses consensus instability in multi-agent NLP systems, offering a new method for improving collaboration and performance on challenging tasks, although it builds incrementally on existing consensus-seeking approaches.

The paper tackles unstable consensus in multi-agent systems for complex NLP tasks. It proposes a framework that selects optimal collaborators for each agent and calibrates the consensus judgment using system-internal beliefs, improving accuracy over the best existing results by 2.23% on MATH and 3.95% on MMLU.

A multi-agent system (MAS) enhances its capacity to solve complex natural language processing (NLP) tasks through collaboration among multiple agents, where consensus-seeking serves as a fundamental mechanism. However, existing consensus-seeking approaches typically rely on voting mechanisms to judge consensus, overlooking contradictions in system-internal beliefs that destabilize the consensus. Moreover, these methods often involve agents updating their results through indiscriminate collaboration with every other agent. Such uniform interaction fails to identify the optimal collaborators for each agent, hindering the emergence of a stable consensus. To address these challenges, we provide a theoretical framework for selecting optimal collaborators that maximize consensus stability. Based on the theorems, we propose the Belief-Calibrated Consensus Seeking (BCCS) framework to facilitate stable consensus via selecting optimal collaborators and calibrating the consensus judgment by system-internal beliefs. Experimental results on the MATH and MMLU benchmark datasets demonstrate that the proposed BCCS framework outperforms the best existing results by 2.23% and 3.95% in accuracy on challenging tasks, respectively. Our code and data are available at https://github.com/dengwentao99/BCCS.
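
To make the contrast between vote-based and belief-aware consensus judgment concrete, here is a minimal illustrative sketch. It is not the BCCS algorithm from the paper or its repository. The function names, the scalar per-agent "belief" scores, and the 0.5 threshold are all assumptions made for illustration only. The toy example shows how a plain vote can declare consensus on one answer while the agents' belief mass favours another, which is the kind of contradiction the abstract says voting overlooks.

```python
from collections import Counter, defaultdict


def vote_consensus(answers, threshold=0.5):
    """Return the most common answer if its vote share exceeds threshold, else None.

    Hypothetical helper: a plain voting rule that ignores agent beliefs.
    """
    counts = Counter(answers)
    answer, count = counts.most_common(1)[0]
    return answer if count / len(answers) > threshold else None


def belief_weighted_consensus(answers, beliefs, threshold=0.5):
    """Return the answer whose share of total belief mass exceeds threshold, else None.

    Hypothetical helper: each agent's vote is weighted by an assumed scalar
    belief (confidence) in its own answer. This is a stand-in for belief-aware
    judgment, not the calibration rule derived in the paper.
    """
    mass = defaultdict(float)
    for answer, belief in zip(answers, beliefs):
        mass[answer] += belief
    total = sum(mass.values())
    if total == 0:
        return None
    best = max(mass, key=mass.get)
    return best if mass[best] / total > threshold else None


# Toy data (assumed): three low-confidence agents answer "A",
# two high-confidence agents answer "B".
answers = ["A", "A", "A", "B", "B"]
beliefs = [0.3, 0.3, 0.3, 0.9, 0.9]

print(vote_consensus(answers))                      # majority by head count
print(belief_weighted_consensus(answers, beliefs))  # majority by belief mass
```

In this toy case the two rules disagree: the head count favours "A" while the belief mass favours "B". A consensus declared by voting alone would therefore rest on answers the system itself holds weakly.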

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.