AI · May 25

CODESKILL: Learning Self-Evolving Skills for Coding Agents

arXiv:2605.25430
AI Analysis

For developers of coding agents, CODESKILL provides a method to automatically build and maintain a compact skill bank that boosts agent performance without manual prompt engineering.

CODESKILL introduces a learnable framework for extracting, evolving, and maintaining procedural skills from coding-agent trajectories. Across three benchmarks, it improves average pass rate by 9.69 over the no-skill baseline and by 4.01 over the strongest prompt-based or memory baseline.

Coding agents produce rich trajectories while solving software-engineering tasks. To enable agent self-evolution, these trajectories can be distilled into reusable procedural skills that compactly encode experience to guide future behavior. However, existing skill construction and maintenance methods often rely on fixed prompts and heuristic update rules, leaving it unclear how knowledge should be selected, abstracted, and maintained to best serve downstream agents. We propose CODESKILL, an LLM-based framework that reformulates skill extraction and skill-bank maintenance as a learnable management policy. CODESKILL extracts multi-granularity procedural skills from coding-agent trajectories, evolves skills with new experience, and maintains a compact skill bank for future task solving. We train CODESKILL with reinforcement learning, using a hybrid reward that combines dense rubric-based skill-quality feedback with sparse verifiable execution feedback from the frozen downstream agent. Experiments on EnvBench, SWE-Bench Verified, and Terminal-Bench 2 show that CODESKILL improves average pass rate by 9.69 over the no-skill baseline and by 4.01 over the strongest prompt-based or memory baseline, while maintaining the skill bank at a stable size during iterative construction.
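The hybrid reward described above can be illustrated with a small sketch. The abstract does not say how the dense rubric signal and the sparse execution signal are combined, so the weighted sum below, the `Feedback` type, the `hybrid_reward` function, and both default weights are assumptions made purely for illustration, not the authors' formulation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Feedback:
    """Hypothetical container for the two reward signals of one extracted skill."""
    rubric_scores: List[float]  # dense signal: per-criterion scores in [0, 1]
    passed: Optional[bool]      # sparse signal: did the frozen agent pass? None if not run


def hybrid_reward(fb: Feedback,
                  dense_weight: float = 0.5,
                  sparse_weight: float = 1.0) -> float:
    """Combine dense and sparse feedback as a weighted sum (an assumed form)."""
    # Dense part: mean rubric score, or 0.0 when no rubric items were scored.
    if fb.rubric_scores:
        dense = sum(fb.rubric_scores) / len(fb.rubric_scores)
    else:
        dense = 0.0
    # Sparse part: 1.0 only when the downstream execution check passed.
    sparse = 1.0 if fb.passed else 0.0
    return dense_weight * dense + sparse_weight * sparse


if __name__ == "__main__":
    good = Feedback(rubric_scores=[1.0, 0.5], passed=True)
    weak = Feedback(rubric_scores=[0.5, 0.5], passed=False)
    print(hybrid_reward(good), hybrid_reward(weak))
```

In this sketch the dense term still provides a learning signal when the execution check fails or is unavailable, which is the usual motivation for pairing a dense reward with a sparse verifiable one.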

