Dewu Zheng

h-index20
2papers

2 Papers

SEDec 24, 2024Code
Top General Performance = Top Domain Performance? DomainCodeBench: A Multi-domain Code Generation Benchmark

Dewu Zheng, Yanlin Wang, Ensheng Shi et al.

With the rapid advancement of large language models (LLMs), extensive research has been conducted to investigate the code generation capabilities of LLMs. However, existing efforts primarily focus on general-domain tasks, leaving LLMs' code generation performance in real-world application domains underexplored. This raises a critical question: can a model's general-domain coding ability reliably represent its ability in specialized domains? In this paper, we introduce DomainCodeBench, a multi-domain code generation benchmark designed to systematically evaluate LLMs across 12 software application domains and 15 programming languages. DomainCodeBench contains 2,400 manually verified tasks with ground truth, human-annotated docstrings, and fine-grained dependency information to ensure more coverage of domain-specific challenges. Specifically, we first identify the most popular application domains by topic mining. Then, we curate coding tasks based on commonly used frameworks and platforms in each domain. We obtain several findings through extensive experiments on DomainCodeBench with ten mainstream LLMs. (1) Performance decoupling: experiments reveal that top general-domain models do not consistently excel in specific application domains; (2) Domain-specific weaknesses: LLMs often fail due to domain knowledge gaps and third-party library misusage; (3) Contextual enhancement: we show that augmenting prompts with domain-specific knowledge improves performance by around 38.17%, providing actionable insights for performance optimization. Our replication package, including the benchmark, source code, and experimental results, is available at https://github.com/DeepSoftwareAnalytics/DomainCodeBench.

SEApr 11, 2025
Towards an Understanding of Context Utilization in Code Intelligence

Yanlin Wang, Kefeng Duan, Dewu Zheng et al.

Code intelligence is an emerging domain in software engineering, aiming to improve the effectiveness and efficiency of various code-related tasks. Recent research suggests that incorporating contextual information beyond the basic original task inputs (i.e., source code) can substantially enhance model performance. Such contextual signals may be obtained directly or indirectly from sources such as API documentation or intermediate representations like abstract syntax trees can significantly improve the effectiveness of code intelligence. Despite growing academic interest, there is a lack of systematic analysis of context in code intelligence. To address this gap, we conduct an extensive literature review of 146 relevant studies published between September 2007 and August 2024. Our investigation yields four main contributions. (1) A quantitative analysis of the research landscape, including publication trends, venues, and the explored domains; (2) A novel taxonomy of context types used in code intelligence; (3) A task-oriented analysis investigating context integration strategies across diverse code intelligence tasks; (4) A critical evaluation of evaluation methodologies for context-aware methods. Based on these findings, we identify fundamental challenges in context utilization in current code intelligence systems and propose a research roadmap that outlines key opportunities for future research.