Understanding Digits in Identifier Names: An Exploratory Study
This paper addresses the challenge of automatically reasoning about the quality of identifier names for software developers. Its contribution is incremental, as it builds on existing word-focused research.
The paper examines digits in identifier names through an empirical study of 800 open-source Java systems, reporting how digits contribute to the semantics of names and how names containing digits evolve over time.
Before any software maintenance can occur, developers must read the identifier names found in the code to be maintained. Thus, high-quality identifier names are essential for productive program comprehension and maintenance activities. With developers free to construct identifier names to their liking, it can be difficult to automatically reason about the quality and semantics behind an identifier name. Studying the structure of identifier names can help alleviate this problem. Existing research focuses on studying words within identifiers, but there are other symbols that appear in identifier names -- such as digits. This paper explores the presence and purpose of digits in identifier names through an empirical study of 800 open-source Java systems. We study how digits contribute to the semantics of identifier names and how identifier names that contain digits evolve over time through renaming. We envision our findings improving the efficiency of name appraisal and recommendation tools and techniques.
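To make the notion of "digits in identifier names" concrete, the following is a minimal sketch of how an identifier might be split into word and digit tokens so that the digit tokens can be inspected separately. It is not the tooling used in the study: the function names `split_identifier` and `digit_tokens`, the splitting rules, and the example identifiers are all assumptions chosen for illustration.

```python
import re

# Hypothetical tokenizer (an assumption, not the study's method).
# Alternatives, tried in order:
#   1. a run of uppercase letters not followed by a lowercase letter (acronyms)
#   2. an optional uppercase letter followed by lowercase letters (camelCase words)
#   3. a run of ASCII digits
# Underscores and other separators match nothing and are skipped.
_TOKEN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_identifier(name):
    """Split an identifier into word tokens and digit tokens."""
    return _TOKEN.findall(name)


def digit_tokens(name):
    """Return only the tokens of an identifier that consist of digits."""
    return [token for token in split_identifier(name) if token.isdigit()]


if __name__ == "__main__":
    # Illustrative identifiers, not taken from the studied systems.
    for identifier in ["utf8Decoder", "point2D", "MAX_RETRIES_3", "userName"]:
        print(identifier, split_identifier(identifier), digit_tokens(identifier))
```

Separating digit runs from words in this way is only a first step. Deciding whether a digit carries meaning (as in `utf8`) or merely distinguishes one name from another (as in `value1` and `value2`) requires the kind of semantic analysis the paper describes.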