AI | Feb 23

Interaction Theater: A case of LLM Agents Interacting at Scale

arXiv:2602.20059v1 | 1 citation | h-index: 6
Originality: Incremental advance
AI Analysis

This study highlights a critical problem for multi-agent system designers: without explicit coordination, large populations of capable agents produce parallel output rather than productive exchange. The advance is incremental but important for the field.

The paper empirically analyzes interactions between autonomous LLM agents at scale using data from an AI-agent-only social platform. It finds that agents produce diverse, well-formed text, but the substance of interaction is largely absent: 65% of comments share no distinguishing content vocabulary with the post they appear under, and the dominant comment types are spam (28%) and off-topic content (22%).

As multi-agent architectures and agent-to-agent protocols proliferate, a fundamental question arises: what actually happens when autonomous LLM agents interact at scale? We study this question empirically using data from Moltbook, an AI-agent-only social platform, with 800K posts, 3.5M comments, and 78K agent profiles. We combine lexical metrics (Jaccard specificity), embedding-based semantic similarity, and LLM-as-judge validation to characterize agent interaction quality. Our findings reveal that agents produce diverse, well-formed text that creates the surface appearance of active discussion, but the substance is largely absent. Specifically, while most agents (67.5%) vary their output across contexts, 65% of comments share no distinguishing content vocabulary with the post they appear under, and information gain from additional comments decays rapidly. LLM-judge-based metrics classify the dominant comment types as spam (28%) and off-topic content (22%). Embedding-based semantic analysis confirms that lexically generic comments are also semantically generic. Agents rarely engage in threaded conversation (5% of comments), defaulting instead to independent top-level responses. We discuss implications for multi-agent interaction design, arguing that coordination mechanisms must be explicitly designed; without them, even large populations of capable agents produce parallel output rather than productive exchange.
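The abstract names a lexical metric (Jaccard specificity) but does not define it here. As a rough illustration only, the sketch below computes a plain Jaccard overlap between the content vocabulary of a comment and that of its post. The stopword list, the tokenizer, and the function names (`content_vocab`, `jaccard`, `comment_post_overlap`) are assumptions made for this example, not the paper's implementation, and the sample texts are invented.

```python
import re

# Assumed, illustrative stopword list; the paper's actual list is not given.
STOPWORDS = {
    "the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
    "this", "that", "for", "on", "with", "very",
}

def content_vocab(text, stopwords=STOPWORDS):
    """Lowercase the text, tokenize on runs of letters, and drop stopwords."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {t for t in tokens if t not in stopwords}

def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def comment_post_overlap(comment, post):
    """Overlap between a comment's and a post's content vocabulary."""
    return jaccard(content_vocab(comment), content_vocab(post))

# Invented sample texts, for illustration only.
post = "Benchmarking vector databases for retrieval latency"
on_topic = "Retrieval latency depends heavily on the index type"
generic = "Thanks for sharing, very insightful!"

print(comment_post_overlap(on_topic, post))  # shares "retrieval" and "latency"
print(comment_post_overlap(generic, post))   # shares no content words
```

Under these assumptions, a comment that scores zero shares no content words with its post, which is the kind of comment the 65% figure describes. A fuller treatment would also discount words that are common across the whole corpus, which this sketch does not attempt.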

