Credit-Budgeted ICPC-Style Coding: When Agents Must Pay for Every Decision
For developers of autonomous coding agents, this work shifts evaluation from isolated accuracy to cost-aware problem-solving, highlighting a critical gap in current agent architectures.
The paper introduces USACOArena, a credit-budgeted coding arena that forces agents to pay for every decision, revealing that frontier agents fail to optimally balance accuracy with resource constraints.
Current evaluations of autonomous coding agents assume an unrealistic, infinite-resource environment. However, real-world software engineering is a resource-bound competition. As we scale toward large agent swarms, ignoring compute and time costs risks catastrophic budget exhaustion. To shift the focus from isolated accuracy to cost-aware problem-solving, we introduce USACOArena, an interactive ACM-ICPC-style arena driven by a strict "credit" economy. Every generated token, local test, and elapsed second depletes a fixed budget, forcing agents to make strategic trade-offs. Our comprehensive profiling reveals that frontier single agents and swarms currently fail to optimally balance accuracy with these constraints, exhibiting divergent, path-dependent behaviors. Ultimately, USACOArena provides an essential dynamic training ground for developing highly efficient, resource-aware agent architectures.
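To make the credit economy concrete, here is a minimal sketch of a ledger in which tokens, local tests, and elapsed seconds all draw from one fixed budget. It is an illustration only, not USACOArena's actual accounting: the class name, method names, and all three rates are assumptions chosen for readability, since the text above does not state real prices or an API.

```python
import math
from dataclasses import dataclass, field

# Hypothetical rates (assumptions; the source does not state actual prices).
CREDITS_PER_TOKEN = 1
CREDITS_PER_TEST = 50
CREDITS_PER_SECOND = 10


class BudgetExhausted(Exception):
    """Raised when an action would cost more than the remaining credits."""


@dataclass
class CreditLedger:
    """Tracks one fixed budget shared by every kind of agent action."""
    budget: int
    spent: int = 0
    log: list = field(default_factory=list)

    @property
    def remaining(self) -> int:
        return self.budget - self.spent

    def _charge(self, kind: str, cost: int) -> int:
        # Refuse the action outright instead of letting the balance go negative.
        if cost > self.remaining:
            raise BudgetExhausted(f"{kind} costs {cost}, only {self.remaining} left")
        self.spent += cost
        self.log.append((kind, cost))
        return cost

    def charge_tokens(self, n_tokens: int) -> int:
        return self._charge("tokens", n_tokens * CREDITS_PER_TOKEN)

    def charge_test(self) -> int:
        return self._charge("local_test", CREDITS_PER_TEST)

    def charge_seconds(self, seconds: float) -> int:
        # Round partial seconds up so elapsed time is never free.
        return self._charge("time", math.ceil(seconds) * CREDITS_PER_SECOND)


if __name__ == "__main__":
    ledger = CreditLedger(budget=1000)
    ledger.charge_tokens(200)    # generate a draft solution
    ledger.charge_test()         # run one local test
    ledger.charge_seconds(4.2)   # wall-clock time, billed as 5 seconds
    print(ledger.remaining)
```

Under these assumed rates, a single local test costs as much as 50 generated tokens, so an agent has to decide whether testing a draft is worth more than writing a second attempt. That is the kind of trade-off the arena is meant to expose.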