Vision-Guided Iterative Refinement for Frontend Code Generation
This work addresses the need for more efficient, higher-quality automated code generation in web development, though the contribution is incremental because it builds on existing multi-stage refinement approaches.
The paper tackles the cost of human-in-the-loop refinement in frontend code generation by introducing an automated critic-in-the-loop framework in which a vision-language model provides visual feedback on rendered pages, achieving a performance increase of up to 17.8% over three refinement cycles.
Code generation with large language models often relies on multi-stage human-in-the-loop refinement, which is effective but very costly, particularly in domains such as frontend web development where solution quality depends on the rendered visual output. We present a fully automated critic-in-the-loop framework in which a vision-language model (VLM) serves as a visual critic that provides structured feedback on rendered webpages to guide iterative refinement of the generated code. Across real-world user requests from the WebDev Arena dataset, this approach yields consistent improvements in solution quality, achieving an increase in performance of up to 17.8% over three refinement cycles. Next, we investigate parameter-efficient fine-tuning using LoRA to understand whether the improvements provided by the critic can be internalized by the code-generating LLM. Fine-tuning achieves 25% of the gains from the best critic-in-the-loop solution without a significant increase in token counts. Our findings indicate that automated, VLM-based critique of frontend code generation leads to significantly higher-quality solutions than can be achieved through a single LLM inference pass, and highlight the importance of iterative refinement for the complex visual outputs associated with web development.
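The critic-in-the-loop cycle described above (generate, render, critique, regenerate) can be sketched in a few lines. This is only an illustrative outline under stated assumptions, not the authors' implementation: the names `refine`, `Critique`, `generate`, `render`, and `critique` are hypothetical, the three callables stand in for the code-generating LLM, a page renderer, and the VLM critic, and the early stop when the critic reports no issues is an assumption that the abstract does not mention.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Critique:
    """Structured feedback from the visual critic (hypothetical schema)."""
    issues: List[str]


def refine(
    request: str,
    generate: Callable[[str, str, List[str]], str],
    render: Callable[[str], bytes],
    critique: Callable[[str, bytes], Critique],
    cycles: int = 3,
) -> List[str]:
    """Return every code version produced: the first draft plus one per cycle.

    generate(request, previous_code, issues) -> new code (stand-in for the LLM)
    render(code) -> screenshot bytes (stand-in for a page renderer)
    critique(request, screenshot) -> Critique (stand-in for the VLM critic)
    """
    code = generate(request, "", [])   # single-pass baseline, no feedback yet
    history = [code]
    for _ in range(cycles):
        screenshot = render(code)
        feedback = critique(request, screenshot)
        if not feedback.issues:        # assumed early stop, not from the paper
            break
        # Feed the critic's issues back to the generator for the next draft.
        code = generate(request, code, feedback.issues)
        history.append(code)
    return history
```

Passing the three components in as callables keeps the loop independent of any particular model or rendering backend, and returning the full history makes it possible to score each refinement cycle separately.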