SEAI | Oct 23, 2025

AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents

arXiv:2510.21031v1 | 1 citation | h-index: 20 | J Syst Softw
Originality: Incremental advance
AI Analysis

This paper addresses the need for better architecture evaluation methods for foundation model-based agents. The advance is incremental: it builds on existing evaluation concepts and adapts them specifically to agents.

The paper tackles the problem of evaluating the architecture of foundation model-based agents, whose unique characteristics traditional evaluation methods fail to address. It presents AgentArcEval, a novel evaluation method, and demonstrates it through a case study on a real-world tax copilot named Luna.

The emergence of foundation models (FMs) has enabled the development of highly capable and autonomous agents, unlocking new application opportunities across a wide range of domains. Evaluating the architecture of agents is particularly important because architectural decisions significantly impact the quality attributes of agents, given their unique characteristics: compound architecture, autonomous and non-deterministic behaviour, and continuous evolution. However, traditional architecture evaluation methods fall short in addressing the evaluation needs of agent architecture because of these characteristics. Therefore, in this paper, we present AgentArcEval, a novel agent architecture evaluation method designed specifically to address the complexities of FM-based agent architecture and its evaluation. Moreover, we present a catalogue of agent-specific general scenarios, which serves as a guide for generating concrete scenarios to design and evaluate the agent architecture. We demonstrate the usefulness of AgentArcEval and the catalogue through a case study on the architecture evaluation of a real-world tax copilot, named Luna.

