SEAI Dec 16, 2025

PerfCoder: Large Language Models for Interpretable Code Performance Optimization

arXiv:2512.14018v1
3 citations
h-index: 4
Originality: Incremental advance
AI Analysis

The paper addresses a critical need for high-performance code generation in real-world software systems, representing a domain-specific advancement in AI for software engineering.

The paper tackles the problem of large language models (LLMs) generating low-performance code by introducing PerfCoder, a family of LLMs fine-tuned for interpretable code performance optimization, which achieves state-of-the-art results on the PIE benchmark with improved runtime speedup and optimization rates.

Large language models (LLMs) have achieved remarkable progress in automatic code generation, yet their ability to produce high-performance code, a critical requirement in real-world software systems, remains limited. We argue that current LLMs struggle not only due to data scarcity but, more importantly, because they lack supervision that guides interpretable and effective performance improvements. In this work, we introduce PerfCoder, a family of LLMs specifically designed to generate performance-enhanced code from source code via interpretable, customized optimizations. PerfCoder is fine-tuned on a curated collection of real-world optimization trajectories with human-readable annotations, and preference-aligned by reinforcement fine-tuning using runtime measurements, enabling it to propose input-specific improvement strategies and apply them directly without relying on iterative refinement. On the PIE code performance benchmark, PerfCoder surpasses all existing models in both runtime speedup and effective optimization rate, demonstrating that performance optimization cannot be achieved by scale alone but requires optimization strategy awareness. In addition, PerfCoder can generate interpretable feedback about the source code, which, when provided as input to a larger LLM in a planner-and-optimizer cooperative workflow, can further improve outcomes. Specifically, we elevate the performance of 32B models and GPT-5 to new levels on code optimization, substantially surpassing their original performance.
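
To make the two reported metrics concrete, the sketch below shows one plausible way to compute runtime speedup and an optimization rate from measured runtimes. The abstract does not define either metric, so the details here are assumptions: speedup is taken as baseline runtime divided by optimized runtime, an incorrect or slower output is counted as a speedup of 1.0, and a program counts as optimized only if it is correct and at least 10% faster. The function names and the `min_speedup` threshold are hypothetical and are not taken from the paper.

```python
from statistics import mean


def speedup(baseline_s: float, optimized_s: float) -> float:
    """Runtime speedup: baseline runtime divided by optimized runtime."""
    if baseline_s <= 0 or optimized_s <= 0:
        raise ValueError("runtimes must be positive")
    return baseline_s / optimized_s


def summarize(results, min_speedup: float = 1.1):
    """Aggregate (baseline_s, optimized_s, is_correct) triples.

    Assumed conventions (not confirmed by the abstract):
    - an incorrect or slower output contributes a speedup of 1.0,
      i.e. the original program is kept;
    - a program counts as optimized only if it is correct and its
      speedup is at least `min_speedup`.
    """
    if not results:
        raise ValueError("results must not be empty")

    effective = []
    optimized_count = 0
    for baseline_s, optimized_s, is_correct in results:
        s = speedup(baseline_s, optimized_s)
        # Fall back to 1.0 when the output is wrong or no faster.
        effective.append(s if is_correct and s > 1.0 else 1.0)
        if is_correct and s >= min_speedup:
            optimized_count += 1

    return {
        "mean_speedup": mean(effective),
        "optimization_rate": optimized_count / len(results),
    }


if __name__ == "__main__":
    # Hypothetical measurements in seconds, for illustration only.
    demo = [
        (2.0, 1.0, True),   # correct and 2x faster
        (1.0, 1.0, True),   # correct but unchanged
        (3.0, 1.0, False),  # faster but incorrect, so it is discarded
        (1.0, 2.0, True),   # correct but slower
    ]
    print(summarize(demo))
```

Falling back to 1.0 for failed attempts keeps the mean from rewarding fast but incorrect programs, which matters when comparing models that differ in how often they break the code.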
