AI · SI · May 27

CyberJurors: A Multi-Agent Simulation Task for E-Commerce Disputes Verdict

arXiv:2605.28369 · 86.1 · Has Code
Predicted impact: top 26% in AI · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses the need for automated verdict systems in e-commerce dispute resolution, a domain with unique challenges not handled by existing methods.

The authors introduce the E-commerce Dispute Verdicts (EDV) task and VerdictBench, a multimodal benchmark of 6,000 real-world cases, and propose CyberJurors, a multi-agent framework that outperforms state-of-the-art LLMs, MLLMs, and court simulators in aligning with real-world jury voting patterns.

E-commerce platforms have begun recruiting crowdsourced jurors to adjudicate massive volumes of transaction disputes. Unlike formal legal judgment, e-commerce dispute verdicts require grounding pivotal clues from redundant, multi-round, multimodal evidence and making decisions under flexible platform-specific conventions. These characteristics render existing methods insufficient for this scenario. To bridge this gap, we introduce a pioneering task, E-commerce Dispute Verdicts (EDV), and present VerdictBench, a multimodal benchmark comprising 6,000 real-world cases designed to reflect crowdsourced jury decisions. Building on this benchmark, we propose CyberJurors, a multi-agent framework that clarifies the dispute logic and regulates the verdict process. At the individual level, Individual Verdict Chain-of-Thought decomposes the EDV task into four structured reasoning stages, enabling fine-grained clue perception and clarifying the causal logic between pivotal clues and the dispute focus. At the collective level, Jury Consensus Verdict simulates multi-round discussion and voting among jurors while incorporating verdict precedents to mitigate cognitive biases toward either disputant. Experiments on VerdictBench show that CyberJurors outperforms state-of-the-art LLMs, MLLMs, and court simulators while achieving stronger alignment with real-world jury voting patterns. Code and dataset are available at https://github.com/YanhuiS/CyberJurors and https://huggingface.co/datasets/piggi/VerdictBench.
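To make the collective-level idea concrete, here is a minimal sketch of multi-round voting with a final majority verdict. It is not the authors' implementation: the `Juror` signature, the `jury_consensus` function, the stopping rule (stop on unanimity), and the stub jurors are all assumptions made for illustration, and the LLM-based agents, the discussion text, and the way precedents are actually used are not reproduced.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical juror interface: (case, precedents, previous-round votes) -> vote.
# In the paper the jurors are LLM agents; here they are plain functions.
Juror = Callable[[str, Sequence[str], Sequence[str]], str]


def jury_consensus(case: str, jurors: List[Juror], precedents: Sequence[str],
                   max_rounds: int = 3) -> Tuple[str, Counter]:
    """Run up to max_rounds of voting and return (majority verdict, vote tally).

    Assumed stopping rule: end early once every juror casts the same vote.
    """
    votes: List[str] = []
    for _ in range(max_rounds):
        # Each juror sees the previous round's votes (empty in the first round).
        votes = [juror(case, precedents, tuple(votes)) for juror in jurors]
        if len(set(votes)) == 1:
            break
    tally = Counter(votes)
    # Ties are broken by first occurrence, which is an arbitrary choice here.
    verdict = tally.most_common(1)[0][0]
    return verdict, tally


def stubborn(side: str) -> Juror:
    """Stub juror that always votes for the same side."""
    return lambda case, precedents, prev: side


def follower(default: str) -> Juror:
    """Stub juror that adopts the previous round's majority, if there was one."""
    def vote(case: str, precedents: Sequence[str], prev: Sequence[str]) -> str:
        if not prev:
            return default
        return Counter(prev).most_common(1)[0][0]
    return vote


jurors = [stubborn("buyer"), stubborn("buyer"), follower("seller")]
verdict, tally = jury_consensus("item arrived damaged", jurors, precedents=[])
print(verdict, dict(tally))
```

Returning the full tally alongside the verdict keeps the vote distribution available, which is what a comparison against real-world jury voting patterns would need.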

