CLAILGApr 17

AdaExplore: Failure-Driven Adaptation and Diversity-Preserving Search for Efficient Kernel Generation

CMU
arXiv:2604.1662595.13 citationsh-index: 15
AI Analysis

For developers using domain-specific languages like Triton, AdaExplore enables self-improving code generation without fine-tuning, addressing the lack of reusable knowledge in current LLM agents.

AdaExplore is an agent framework that uses failure-driven adaptation and diversity-preserving search to generate efficient Triton kernels, achieving 3.12x and 1.72x speedups on KernelBench Level-2 and Level-3 within 100 steps.

Recent large language model (LLM) agents have shown promise in using execution feedback for test-time adaptation. However, robust self-improvement remains far from solved: most approaches still treat each problem instance independently, without accumulating reusable knowledge. This limitation is particularly pronounced in domain-specific languages such as Triton, which are underrepresented in LLM pretraining data. Their strict constraints and non-linear optimization landscape further make naive generation and local refinement unreliable. We propose AdaExplore, an agent framework that enables self-improvement via accumulated execution feedback for performance-critical kernel code generation through two complementary stages: failure-driven adaptation and diversity-preserving search, jointly improving correctness and optimization performance without additional fine-tuning or external knowledge. In the adaptation stage, the agent synthesizes tasks and converts recurring failures into a reusable memory of validity rules, helping subsequent generations remain within the feasible set. In the search stage, the agent organizes candidate kernels as a tree and alternates between small local refinements and larger structural regeneration, allowing it to explore the optimization landscape beyond local optima. Experiments on kernel runtime optimization benchmarks validate these gains: AdaExplore achieves 3.12x and 1.72x speedups on KernelBench Level-2 and Level-3, respectively, within 100 steps, and continues to improve with additional computation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes