SE, CL · Jul 28, 2025

Enhancing Project-Specific Code Completion by Inferring Internal API Information

Le Deng, Xiaoxue Ren, Chao Ni, Ming Liang, David Lo, Zhongxin Liu

arXiv:2507.20888v1 · 9 citations · h-index: 6 · IEEE Trans Softw Eng

Originality: Incremental advance

AI Analysis

This work addresses a common bottleneck for developers using code completion tools: completions that depend on project-internal APIs. It offers a significant but incremental enhancement over existing retrieval-augmented generation (RAG) methods.

The paper tackles project-specific code completion, focusing on how to incorporate internal API information when the APIs are not explicitly imported in the file. It reports improvements of 22.72% in code exact match and 18.31% in identifier exact match over existing methods on ProjBench and CrossCodeEval.

Project-specific code completion is a critical task that leverages context from a project to generate accurate code. State-of-the-art methods use retrieval-augmented generation (RAG) with large language models (LLMs) and project information for code completion. However, they often struggle to incorporate internal API information, which is crucial for accuracy, especially when APIs are not explicitly imported in the file. To address this, we propose a method to infer internal API information without relying on imports. Our method extends the representation of APIs by constructing usage examples and semantic descriptions, building a knowledge base for LLMs to generate relevant completions. We also introduce ProjBench, a benchmark that avoids leaked imports and consists of large-scale real-world projects. Experiments on ProjBench and CrossCodeEval show that our approach significantly outperforms existing methods, improving code exact match by 22.72% and identifier exact match by 18.31%. Additionally, integrating our method with existing baselines boosts code match by 47.80% and identifier match by 35.55%.
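The idea of a knowledge base that pairs each internal API with a usage example and a semantic description can be illustrated with a small sketch. This is not the authors' implementation: the entry fields, the sample APIs (`get_cached_user`, `render_invoice_pdf`), the token-overlap scoring, and the prompt layout are all assumptions chosen for illustration, and a real system would likely use an LLM to write the descriptions and a stronger retriever.

```python
import re
from dataclasses import dataclass


@dataclass
class ApiEntry:
    """One internal API with the extended representation (all fields hypothetical)."""
    name: str           # qualified name inside the project
    signature: str      # how the API is called
    usage_example: str  # constructed example call
    description: str    # short semantic description


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens, splitting snake_case and camelCase identifiers."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return {t.lower() for t in re.findall(r"[A-Za-z]+", spaced)}


def score(entry: ApiEntry, context: str) -> float:
    """Jaccard overlap between the unfinished code and the entry's text."""
    entry_tokens = tokenize(
        " ".join([entry.name, entry.usage_example, entry.description])
    )
    context_tokens = tokenize(context)
    union = entry_tokens | context_tokens
    return len(entry_tokens & context_tokens) / len(union) if union else 0.0


def retrieve(kb: list[ApiEntry], context: str, k: int = 2) -> list[ApiEntry]:
    """Return the k entries most similar to the context; no import lines needed."""
    return sorted(kb, key=lambda e: score(e, context), reverse=True)[:k]


def build_prompt(entries: list[ApiEntry], context: str) -> str:
    """Prepend the retrieved API hints as comments to the unfinished code."""
    hints = "\n".join(
        f"# API: {e.signature}\n# {e.description}\n# e.g. {e.usage_example}"
        for e in entries
    )
    return f"{hints}\n{context}"


# Hypothetical knowledge base for an imaginary project.
KB = [
    ApiEntry(
        "cache.store.get_cached_user",
        "get_cached_user(user_id: int) -> User | None",
        "user = get_cached_user(42)",
        "Look up a user in the in-memory cache by id.",
    ),
    ApiEntry(
        "billing.invoice.render_invoice_pdf",
        "render_invoice_pdf(invoice: Invoice) -> bytes",
        "pdf = render_invoice_pdf(invoice)",
        "Render an invoice as PDF bytes.",
    ),
]

# Unfinished code that never imports the API it needs.
CONTEXT = "def show_profile(user_id):\n    # fetch the user from the cache\n    user = "

if __name__ == "__main__":
    print(build_prompt(retrieve(KB, CONTEXT, k=1), CONTEXT))
```

Because retrieval is driven by the unfinished code and its comments rather than by import statements, the relevant API can surface even when the file has not imported it yet, which is the situation the abstract describes.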
