Xiangbing Huang

h-index: 13
2 papers

2 Papers

SE · Dec 11, 2024
Unseen Horizons: Unveiling the Real Capability of LLM Code Generation Beyond the Familiar

Yuanliang Zhang, Yifan Xie, Shanshan Li et al.

Recently, large language models (LLMs) have shown strong potential in code generation tasks. However, there are still gaps before they can be fully applied in actual software development processes. Accurately assessing the code generation capabilities of LLMs has become an important basis for evaluating and improving the models. Some existing works have constructed datasets to evaluate these capabilities. However, the current evaluation process may suffer from the illusion of "Specialist in Familiarity", primarily due to three gaps: the exposure of target code, case timeliness, and dependency availability. The fundamental reason for these gaps is that the code in current datasets may have been extensively exposed and exercised during the training phase, and because LLMs are continuously trained and developed, the timeliness of these datasets has been severely compromised. The key to solving the problem is to evaluate the LLMs, as much as possible, on code that they have not encountered before. Thus, the fundamental idea in this paper is to draw on the concept of code obfuscation, changing code at different levels while preserving its functionality and output. To this end, we build a code-obfuscation based benchmark, OBFUSEVAL. We first collect 1,354 raw cases from five real-world projects, including function descriptions and code. Then we use a three-level strategy (symbol, structure, and semantic) to obfuscate descriptions, code, and context dependencies. We evaluate four LLMs on OBFUSEVAL and compare the effectiveness of the different obfuscation strategies. We use the official test suites of these projects to evaluate the generated code. The results show that after obfuscation, the average decrease ratio of the test pass rate can reach up to 62.5%.
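
To make the symbol level of the strategy above concrete, here is a minimal sketch of identifier renaming that keeps a function's behavior unchanged. It is an illustration under stated assumptions, not the OBFUSEVAL implementation: the helper `obfuscate_symbols`, the `v0`, `v1`, ... naming scheme, and the sample function are all hypothetical, and the choice of Python is ours, since the abstract does not say which language the benchmark projects use. It requires Python 3.9+ for `ast.unparse`.

```python
import ast


def obfuscate_symbols(source: str) -> str:
    """Rename parameters and assigned variables to opaque names (hypothetical helper).

    Function names and unbound names (builtins, imports) are left untouched.
    This sketch ignores keyword-argument call sites and name collisions.
    """
    tree = ast.parse(source)

    # Pass 1: collect every name that is bound as a parameter or assignment target.
    bound = []
    for node in ast.walk(tree):
        if isinstance(node, ast.arg) and node.arg not in bound:
            bound.append(node.arg)
        elif (isinstance(node, ast.Name)
              and isinstance(node.ctx, ast.Store)
              and node.id not in bound):
            bound.append(node.id)
    mapping = {name: f"v{i}" for i, name in enumerate(bound)}

    # Pass 2: rewrite every occurrence of a collected name.
    for node in ast.walk(tree):
        if isinstance(node, ast.arg):
            node.arg = mapping[node.arg]
        elif isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]
    return ast.unparse(tree)


ORIGINAL = '''
def running_total(values, start):
    total = start
    for item in values:
        total = total + item
    return total
'''

OBFUSCATED = obfuscate_symbols(ORIGINAL)


def load(source: str):
    """Execute source and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace["running_total"]


# Both versions compute the same result on the same input.
same_output = load(ORIGINAL)([1, 2, 3], 10) == load(OBFUSCATED)([1, 2, 3], 10)
```

Checking that the original and renamed versions agree on the same inputs mirrors, in miniature, the abstract's use of existing test suites to judge generated code.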

SE · Sep 24, 2021
SEED: Semantic Graph based Deep detection for type-4 clone

Zhipeng Xue, Zhijie Jiang, Chenlin Huang et al.

Type-4 clones refer to a pair of code snippets with similar semantics but written in different syntax, which challenges existing code clone detection techniques. Previous studies, however, rely heavily on syntactic structures and textual tokens, which cannot precisely represent the semantic information of code and might introduce non-negligible noise into the detection models. To overcome these limitations, we design a novel semantic graph-based deep detection approach, called SEED. For a pair of code snippets, SEED constructs a semantic graph of each snippet based on its intermediate representation, which represents the code semantics more precisely than representations based on lexical and syntactic analysis. To accommodate the characteristics of Type-4 clones, the semantic graph focuses on operators and API calls instead of all tokens. Then, SEED generates feature vectors using the graph match network and performs clone detection based on the similarity among the vectors. Extensive experiments show that our approach significantly outperforms two baseline approaches on two public datasets and one customized dataset. In particular, SEED outperforms the baseline methods by an average of 25.2% in F1-score. Our experiments demonstrate that SEED achieves state-of-the-art performance and is useful for Type-4 clone detection in practice.
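
The final step described above, deciding whether two snippets are clones from the similarity of their feature vectors, can be sketched as follows. This is only an illustration: the abstract does not name the similarity measure or a decision threshold, so cosine similarity, the `0.8` cutoff, the function names, and the sample vectors are all assumptions, and the graph construction and graph match network are not reproduced here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_clone(vec_a, vec_b, threshold=0.8):
    """Flag a pair as a clone when similarity meets the (hypothetical) threshold."""
    return cosine_similarity(vec_a, vec_b) >= threshold


# Made-up feature vectors standing in for the output of a graph encoder.
snippet_a = [1.0, 2.0, 3.0]
snippet_b = [2.0, 4.0, 6.0]   # same direction as snippet_a
snippet_c = [3.0, 0.0, -1.0]  # orthogonal to snippet_a

pair_ab = is_clone(snippet_a, snippet_b)
pair_ac = is_clone(snippet_a, snippet_c)
```

In a learned detector the threshold would typically be tuned on a validation set rather than fixed by hand.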