SE · AI · Feb 2

SWE-Universe: Scale Real-World Verifiable Environments to Millions

arXiv:2602.02361v1 · 3 citations · h-index: 27
Originality: Incremental advance
AI Analysis

This work provides a critical resource and methodology for advancing coding agents. It addresses scalability and reliability challenges in building software engineering environments, though its improvement over existing automation techniques is incremental.

The authors address the problem of automatically constructing verifiable real-world software engineering environments from GitHub pull requests. They propose SWE-Universe, a scalable framework built around a building agent that uses iterative self-verification and hacking detection. The framework yields 807,693 environments, and applying the technique to Qwen3-Max-Thinking gives a score of 75.3% on SWE-Bench Verified.

We propose SWE-Universe, a scalable and efficient framework for automatically constructing real-world software engineering (SWE) verifiable environments from GitHub pull requests (PRs). To overcome the prevalent challenges of automatic building, such as low production yield, weak verifiers, and prohibitive cost, our framework utilizes a building agent powered by an efficient custom-trained model. This agent employs iterative self-verification and in-loop hacking detection to ensure the reliable generation of high-fidelity, verifiable tasks. Using this method, we scale the number of real-world multilingual SWE environments to a million scale (807,693). We demonstrate the profound value of our environments through large-scale agentic mid-training and reinforcement learning. Finally, we applied this technique to Qwen3-Max-Thinking and achieved a score of 75.3% on SWE-Bench Verified. Our work provides both a critical resource and a robust methodology to advance the next generation of coding agents.
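The abstract names iterative self-verification and in-loop hacking detection but does not spell out the mechanics. The sketch below is one plausible reading, not the authors' implementation. It assumes that a verifier is accepted only if it fails on the pre-PR state and passes on the post-PR state, and that "hacking" means a verifier that inspects text or exits unconditionally instead of running tests. All names (`Candidate`, `looks_hacked`, `build_environment`, `propose`, `run`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    """A hypothetical verifier proposed by the building agent."""
    script: str  # text of the verification script


def looks_hacked(script: str) -> bool:
    """Toy hacking check (assumption): flag verifiers that match on text
    or exit unconditionally rather than executing the project's tests."""
    suspicious = ("grep ", "exit 0", "echo PASS")
    return any(token in script for token in suspicious)


def build_environment(
    propose: Callable[[Optional[str]], Candidate],
    run: Callable[[Candidate, str], bool],
    max_rounds: int = 3,
) -> Optional[Candidate]:
    """Ask the agent for a verifier, check it, and retry with feedback.

    propose(feedback) returns a new candidate verifier.
    run(candidate, state) returns True if the verifier passes on the
    repository state named by `state` ("before" or "after" the PR).
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        cand = propose(feedback)
        # In-loop hacking detection: reject before spending any execution.
        if looks_hacked(cand.script):
            feedback = "verifier looks hacked"
            continue
        # Self-verification: must fail before the PR and pass after it.
        fails_before = not run(cand, "before")
        passes_after = run(cand, "after")
        if fails_before and passes_after:
            return cand
        feedback = "verifier does not separate pre-PR from post-PR state"
    return None  # no acceptable verifier within the round budget
```

In this reading, a PR that never produces an accepted verifier is simply dropped, which is one way a pipeline could trade yield for verifier quality.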

