AI · GT · Feb 25

Power and Limitations of Aggregation in Compound AI Systems

arXiv:2602.21556v1 · 2 citations · h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses how to design more capable compound AI systems, a problem relevant to both researchers and practitioners. Its contribution is incremental: it characterizes the theoretical limits of aggregation.

The paper investigates whether aggregating responses from multiple identical AI models can expand the set of outputs achievable by a system designer, within a principal-agent framework. It identifies three mechanisms for expansion and provides theoretical conditions and an empirical illustration for LLMs in a toy task.

When designing compound AI systems, a common approach is to query multiple copies of the same model and aggregate the responses to produce a synthesized output. Given the homogeneity of these models, this raises the question of whether aggregation unlocks access to a greater set of outputs than querying a single model. In this work, we investigate the power and limitations of aggregation within a stylized principal-agent framework. This framework models how the system designer can partially steer each agent's output through its reward function specification, but still faces limitations due to prompt engineering ability and model capabilities. Our analysis uncovers three natural mechanisms -- feasibility expansion, support expansion, and binding set contraction -- through which aggregation expands the set of outputs that are elicitable by the system designer. We prove that any aggregation operation must implement one of these mechanisms in order to be elicitability-expanding, and that strengthened versions of these mechanisms provide necessary and sufficient conditions that fully characterize elicitability-expansion. Finally, we provide an empirical illustration of our findings for LLMs deployed in a toy reference-generation task. Altogether, our results take a step towards characterizing when compound AI systems can overcome limitations in model capabilities and in prompt engineering.

