SE | Apr 21

Cascaded Code Editing: Large-Small Model Collaboration for Effective and Efficient Code Editing

Chaozheng Wang, Zezhou Yang, Shuzheng Gao, Cuiyun Gao, Zongjie Li, Yichen Li, Ting Peng, Hailiang Huang, Yuetang Deng, Michael R. Lyu

arXiv:2604.19201 | 86.0 | h-index: 5

AI Analysis

For developers and practitioners using LLMs for code editing, this work offers a practical method to reduce computational cost without sacrificing quality, though the approach is incremental and builds on existing model cascading techniques.

The paper tackles the inefficiency of large language models (LLMs) in code editing: they regenerate entire files even when only a few lines change. By decomposing the task into edit sketch generation (by a large model) and sketch application (by a small model), the authors report a 2.5x speedup while maintaining competitive accuracy on the CodeEditorBench benchmark.

Code editing constitutes a fundamental practice in software development, wherein developers modify existing codebases according to natural language requirements. Accurate code editing necessitates a comprehensive understanding of both the existing codebase and the modification requirements. Although large language models (LLMs) have demonstrated promising performance in code editing tasks, they suffer from substantial inefficiency by generating entire modified files that largely consist of unchanged code. While smaller models could potentially address this inefficiency, they typically lack the capacity to effectively comprehend long code contexts required for accurate editing. To ensure both effectiveness and efficiency, we propose to decompose code editing into a two-stage cascade: edit sketch generation, wherein a large model first produces concise sketches representing the requisite modifications (the more challenging phase), and edit sketch application, wherein a smaller model integrates these sketches into the original code to produce the final edited code (the simpler phase). This cascaded design reduces the number of tokens generated by the large model, as the majority of the output is handled by the smaller, more efficient model, thereby enhancing overall efficiency. However, the effectiveness of this approach is constrained by current small models' limited capabilities in handling long-context scenarios and cross-file dependencies, which are essential for accurate sketch application in real-world codebases. To address these limitations and enhance smaller models' sketch application capabilities, ...
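The two-stage cascade described in the abstract can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not the authors' implementation: the sketch format (an old line and a new line separated by `=>`), the whitespace-based token count, and every name below (`cascaded_edit`, `toy_large_model`, `toy_small_model`, `CascadeResult`) are hypothetical. Real models are replaced by plain functions so the control flow can be run as-is.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interfaces: each "model" is just a function over strings.
SketchGenerator = Callable[[str, str], str]  # (original_code, requirement) -> edit sketch
SketchApplier = Callable[[str, str], str]    # (original_code, sketch) -> edited code


@dataclass
class CascadeResult:
    sketch: str
    edited_code: str
    large_model_tokens: int  # tokens the large model had to generate
    small_model_tokens: int  # tokens the small model had to generate


def count_tokens(text: str) -> int:
    """Crude whitespace token count; a stand-in for a real tokenizer."""
    return len(text.split())


def cascaded_edit(
    original_code: str,
    requirement: str,
    generate_sketch: SketchGenerator,
    apply_sketch: SketchApplier,
) -> CascadeResult:
    # Stage 1: the large model emits only a concise edit sketch.
    sketch = generate_sketch(original_code, requirement)
    # Stage 2: the small model merges the sketch into the full file.
    edited = apply_sketch(original_code, sketch)
    return CascadeResult(sketch, edited, count_tokens(sketch), count_tokens(edited))


# Toy stand-ins (assumptions): a fixed sketch and a string replacement.
def toy_large_model(original_code: str, requirement: str) -> str:
    return "return a + b\n=>\nreturn a + b + c"


def toy_small_model(original_code: str, sketch: str) -> str:
    old, new = sketch.split("\n=>\n")
    return original_code.replace(old, new)


ORIGINAL = (
    "def add(a, b, c=0):\n"
    "    # sum the inputs\n"
    "    return a + b\n"
    "\n"
    "def sub(a, b):\n"
    "    # difference of the inputs\n"
    "    return a - b\n"
)

if __name__ == "__main__":
    result = cascaded_edit(ORIGINAL, "include c in the sum", toy_large_model, toy_small_model)
    print(result.edited_code)
    print("large-model tokens:", result.large_model_tokens)
    print("small-model tokens:", result.small_model_tokens)
```

In this toy setup the large model's output covers only the changed line, while the full file is produced by the applier, which mirrors the efficiency argument in the abstract. The gap between the two token counts would grow with file length.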
