Consensus is Strategically Insufficient: Reasoning-Trace Disagreement as a Knowledge-Representation Signal
For multi-agent systems in value-laden domains, this work proposes a method that treats disagreement as a signal for strategic decision-making. It is an incremental extension of prior work on reasoning-trace disagreement.
The paper argues that consensus is insufficient for value-laden multi-agent tasks and proposes a knowledge-representation layer that classifies reasoning-trace similarity and decision agreement into four states to enable strategic routing. The framework is instantiated in content moderation, bridging LLM deliberation and symbolic reasoning.
Multi-agent systems are commonly designed to reduce disagreement through voting, consensus protocols, debate, or fault-tolerant aggregation. We argue that this objective is insufficient for value-laden tasks, where disagreement may reflect genuine normative uncertainty rather than agent error. Building on prior work on reasoning-trace disagreement in human-AI collaborative moderation, we propose a knowledge-representation layer in which reasoning traces and agent decisions are abstracted into symbolic disagreement states. Given agents that produce explicit reasoning traces and binary decisions, we distinguish four states according to reasoning similarity and conclusion agreement: convergent agreement, divergent agreement, convergent disagreement, and divergent disagreement. These states support defeasible strategic routing rules. We instantiate the framework in content moderation and argue that disagreement-aware routing provides a bridge between sub-symbolic LLM deliberation and symbolic knowledge representation for multi-agent strategic reasoning.
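The four-state abstraction can be illustrated with a minimal sketch for two agents. Several details here are assumptions rather than part of the abstract: "convergent" is read as similar reasoning and "agreement" as matching binary decisions, reasoning similarity is taken to be a precomputed score in [0, 1] with a hypothetical threshold of 0.5, and the route names and the default-plus-override scheme are illustrative stand-ins for the defeasible routing rules, not the rules the paper defines.

```python
from enum import Enum
from typing import Dict, Optional


class DisagreementState(Enum):
    CONVERGENT_AGREEMENT = "convergent agreement"
    DIVERGENT_AGREEMENT = "divergent agreement"
    CONVERGENT_DISAGREEMENT = "convergent disagreement"
    DIVERGENT_DISAGREEMENT = "divergent disagreement"


def classify(similarity: float, decision_a: bool, decision_b: bool,
             threshold: float = 0.5) -> DisagreementState:
    """Map a reasoning-similarity score and two binary decisions to a state.

    Assumption: `similarity` is computed elsewhere (for example from trace
    embeddings); `threshold` is a hypothetical cut-off, not a published value.
    """
    similar = similarity >= threshold
    agree = decision_a == decision_b
    if agree:
        return (DisagreementState.CONVERGENT_AGREEMENT if similar
                else DisagreementState.DIVERGENT_AGREEMENT)
    return (DisagreementState.CONVERGENT_DISAGREEMENT if similar
            else DisagreementState.DIVERGENT_DISAGREEMENT)


# Hypothetical default routes; each holds unless a more specific rule defeats it.
DEFAULT_ROUTES: Dict[DisagreementState, str] = {
    DisagreementState.CONVERGENT_AGREEMENT: "auto_resolve",
    DisagreementState.DIVERGENT_AGREEMENT: "audit_rationale",
    DisagreementState.CONVERGENT_DISAGREEMENT: "clarify_policy",
    DisagreementState.DIVERGENT_DISAGREEMENT: "escalate_to_human",
}


def route(state: DisagreementState,
          overrides: Optional[Dict[DisagreementState, str]] = None) -> str:
    """Return the route for a state, letting an override defeat the default."""
    if overrides and state in overrides:
        return overrides[state]
    return DEFAULT_ROUTES[state]


if __name__ == "__main__":
    # Two moderation agents reach the same decision through dissimilar reasoning.
    state = classify(similarity=0.2, decision_a=True, decision_b=True)
    print(state.value, "->", route(state))
```

Keeping the state as a symbolic value, separate from the similarity score that produced it, is what allows the routing rules to be stated and overridden independently of how the traces are compared.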