CR AI
Jul 24, 2023

How Does Naming Affect LLMs on Code Analysis Tasks?

arXiv:2307.12488v5
7 citations
h-index: 78
Originality: Synthesis-oriented
AI Analysis

It addresses a practical problem for developers and researchers using LLMs in software engineering, but the findings are incremental as they confirm an intuitive expectation.

This paper investigates how variable and function naming affects Large Language Models (LLMs) like CodeBERT and GPT on code analysis tasks, finding that nonsense or misleading names significantly reduce performance, indicating heavy reliance on well-defined names.

Large Language Models (LLMs) such as GPT and BERT were proposed for natural language processing (NLP) and have shown promising results as general-purpose language models. A growing number of industry professionals and researchers are adopting LLMs for program analysis tasks. However, one significant difference between programming languages and natural languages is that a programmer is free to assign any name to the variables, methods, and functions in a program, whereas a natural language writer is not. Intuitively, the quality of naming in a program affects the performance of LLMs on program analysis tasks. This paper investigates how naming affects LLMs on code analysis tasks. Specifically, we create a set of datasets with code containing nonsense or misleading names for variables, methods, and functions, respectively. We then use well-trained models (CodeBERT) to perform code analysis tasks on these datasets. The experimental results show that naming has a significant impact on the performance of LLM-based code analysis, indicating that code representation learning based on LLMs relies heavily on well-defined names in code. Additionally, we conduct a case study on some special code analysis tasks using GPT, providing further insights.
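To make the idea of "nonsense or misleading names" concrete, here is a minimal sketch of how such a variant of a code sample could be produced. It is not the authors' pipeline: the abstract does not say which languages or tools were used, so the choice of Python's standard `ast` module, the helper name `obfuscate_names`, and the `v0, v1, ...` naming scheme are all assumptions made for illustration. It requires Python 3.9 or later for `ast.unparse`.

```python
import ast


class _DefinedNames(ast.NodeVisitor):
    """Collect names the snippet itself defines: functions, parameters, assigned variables."""

    def __init__(self):
        self.names = []

    def _add(self, name):
        if name not in self.names:
            self.names.append(name)

    def visit_FunctionDef(self, node):
        self._add(node.name)
        self.generic_visit(node)

    def visit_arg(self, node):
        self._add(node.arg)

    def visit_Name(self, node):
        # Only names that are written to count as "defined here";
        # builtins such as print or len are only ever read.
        if isinstance(node.ctx, ast.Store):
            self._add(node.id)


class _Renamer(ast.NodeTransformer):
    """Apply an old-name -> new-name mapping to definitions and uses."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_FunctionDef(self, node):
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node


def obfuscate_names(source, mapping=None):
    """Return `source` with identifiers renamed (hypothetical helper).

    With no mapping, every locally defined name becomes a nonsense name
    (v0, v1, ...). Passing an explicit mapping instead allows "misleading"
    names, e.g. {"add_prices": "delete_file"}.
    """
    tree = ast.parse(source)
    if mapping is None:
        collector = _DefinedNames()
        collector.visit(tree)
        mapping = {name: f"v{i}" for i, name in enumerate(collector.names)}
    return ast.unparse(_Renamer(mapping).visit(tree))


original = """
def add_prices(price, tax):
    total = price + tax
    return total
"""
print(obfuscate_names(original))
```

For the sample above, the printed result is a function `v0(v1, v2)` whose body assigns `v3 = v1 + v2` and returns `v3`: the behavior is unchanged, but every naming cue is gone. The sketch is deliberately narrow. It ignores classes, attribute names, and keyword arguments at call sites, and it does not check whether a generated name collides with an existing one.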

