AI SE Sep 3, 2025

app.build: A Production Framework for Scaling Agentic Prompt-to-App Generation with Environment Scaffolding

Evgenii Kniazev, Arseny Kravchenko, Igor Rekun, James Broadhead, Nikita Shamgunov, Pranav Sah, Pratik Nichite, Ivan Yamshchikov

arXiv:2509.03310v1 3.3 h-index: 1 Has Code

Originality: Incremental advance

AI Analysis

This work targets production-oriented agent systems for developers and offers an incremental improvement through structured environments and validation.

The paper tackles the problem of scaling reliable AI agents for application generation by introducing app.build, a framework that uses systematic validation and structured environments, achieving a 73.3% viability rate and enabling open-weights models to reach 80.8% of closed-model performance.

We present app.build (https://github.com/appdotbuild/agent/), an open-source framework that improves LLM-based application generation through systematic validation and structured environments. Our approach combines multi-layered validation pipelines, stack-specific orchestration, and a model-agnostic architecture, implemented across three reference stacks. Through evaluation on 30 generation tasks, we demonstrate that comprehensive validation achieves a 73.3% viability rate, with 30% reaching perfect quality scores, while open-weights models achieve 80.8% of closed-model performance when provided with structured environments. The open-source framework has been adopted by the community, with over 3,000 applications generated to date. This work demonstrates that scaling reliable AI agents requires scaling environments, not just models, and provides empirical insights and complete reference implementations for production-oriented agent systems.
