MA · AI · CL · LG · May 30, 2025

An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring

arXiv:2505.24239v1 · 4 citations · h-index: 3 · IJCNLP-AACL
Originality: Incremental advance
AI Analysis

This paper addresses a security and reliability problem for users of multi-agent LLM systems, though the advance appears incremental because it builds on existing multi-agent frameworks.

The paper tackles the vulnerability of multi-agent LLM systems to adversarial and low-performing agents by introducing a credibility scoring framework that models collaboration as an iterative game, demonstrating effectiveness in mitigating adversarial influence across multiple tasks and settings.

While multi-agent LLM systems show strong capabilities in various domains, they are highly vulnerable to adversarial and low-performing agents. To resolve this issue, we introduce a general, adversary-resistant multi-agent LLM framework based on credibility scoring. We model the collaborative query-answering process as an iterative game in which the agents communicate and contribute to a final system output. Our system associates a credibility score with each agent and uses these scores when aggregating the team's outputs. The credibility scores are learned gradually from each agent's past contributions to query answering. Our experiments across multiple tasks and settings demonstrate our system's effectiveness in mitigating adversarial influence and enhancing the resilience of multi-agent cooperation, even in adversary-majority settings.
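The idea of credibility-weighted aggregation can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify the scoring or update rule, so the code below assumes a simple multiplicative penalty for wrong answers and assumes that the correct answer becomes known after each round. The names `aggregate`, `update`, `run_rounds`, and the `penalty` parameter are all hypothetical.

```python
from collections import defaultdict


def aggregate(answers, credibility):
    """Pick the answer backed by the largest total credibility."""
    totals = defaultdict(float)
    for agent, answer in answers.items():
        totals[answer] += credibility[agent]
    return max(totals, key=totals.get)


def update(credibility, answers, correct_answer, penalty=0.5):
    """Assumed rule (not from the paper): scale down the score of every
    agent that answered incorrectly, then renormalize to sum to 1."""
    scaled = {
        agent: score * (1.0 if answers[agent] == correct_answer else 1.0 - penalty)
        for agent, score in credibility.items()
    }
    total = sum(scaled.values())
    return {agent: score / total for agent, score in scaled.items()}


def run_rounds(rounds, agents):
    """Play the iterative game: aggregate first, then learn from feedback.

    `rounds` is a list of (answers, correct_answer) pairs, where `answers`
    maps each agent name to its answer for that round.
    """
    credibility = {agent: 1.0 / len(agents) for agent in agents}
    outputs = []
    for answers, correct_answer in rounds:
        outputs.append(aggregate(answers, credibility))
        credibility = update(credibility, answers, correct_answer)
    return outputs, credibility
```

With one honest agent and two adversaries that always agree on a wrong answer, the first round's output follows the adversarial majority, but under this assumed update rule the honest agent's score overtakes the adversaries' combined score after a few rounds, which is the kind of adversary-majority resilience the abstract describes.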

