SEPL Jan 21, 2014

How are identifiers named in open source software? About popularity and consistency

arXiv:1401.5300v2
1 citation
Has Code
AI Analysis

This paper addresses software maintenance and coding standards, a concern for educators and managers, but its contribution is incremental because it builds on existing naming convention studies.

The study investigated the popularity and consistency of identifier naming conventions in open source software, finding that Camel and Pascal conventions are most popular while Hungarian notation is declining, and Java projects show better consistency than C/C++ projects.

As software projects grow rapidly in size and maintenance cost, adherence to coding standards, especially in identifier naming, is becoming a pressing concern for both computer science educators and software managers. Software developers rely mainly on identifier names to represent the knowledge recorded in source code. However, the popularity of identifier naming conventions, and the consistency with which they are adopted, have not yet been reported in this field. Taking forty-eight popular open source projects written in three top-ranking programming languages (Java, C, and C++) as examples, we develop an identifier extraction tool based on regular expression matching. The subsequent investigation yields some interesting findings. Regarding popularity, the Camel and Pascal naming conventions lead the way, while Hungarian notation is vanishing. Regarding consistency, projects written in Java perform much better than those written in C and C++. Finally, we urge academia and the software industry to adopt the most popular naming conventions consistently in practice, so that identifier naming moves toward a standard, unified, and high-quality state.
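The abstract does not give the regular expressions the extraction tool uses, so the following Python sketch only illustrates the general idea of regex-based identifier extraction and convention tallying. Every pattern, the keyword list, the Hungarian prefix list, the extra "underscore" and "other" categories, and the function names are assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter

# Assumed pre-processing: blank out comments and string/char literals so that
# words inside them are not counted as identifiers.
COMMENT_OR_STRING = re.compile(
    r"//[^\n]*"               # line comment
    r"|/\*.*?\*/"             # block comment
    r'|"(?:\\.|[^"\\])*"'     # double-quoted string literal
    r"|'(?:\\.|[^'\\])*'",    # character literal
    re.DOTALL,
)
IDENTIFIER = re.compile(r"\b[A-Za-z_][A-Za-z0-9_]*\b")

# Deliberately tiny keyword list (assumption); a real tool would need the
# full keyword set of Java, C and C++.
KEYWORDS = {"int", "char", "void", "return", "if", "else", "for", "while",
            "class", "public", "private", "static", "final", "new"}

# Hypothetical heuristics, tried in order. The Hungarian prefix list is a
# small illustrative subset and will misclassify some camel-case names.
CONVENTIONS = [
    ("hungarian", re.compile(r"^(?:sz|lp|dw|b|n|p|i|f|c)[A-Z][A-Za-z0-9]*$")),
    ("camel", re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)+$")),
    ("pascal", re.compile(r"^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]*)*$")),
    ("underscore", re.compile(r"^[a-z][a-z0-9]*(?:_[a-z0-9]+)+$")),
]


def extract_identifiers(source):
    """Return non-keyword identifiers in order of appearance."""
    stripped = COMMENT_OR_STRING.sub(" ", source)
    return [name for name in IDENTIFIER.findall(stripped)
            if name not in KEYWORDS]


def classify(name):
    """Return the first convention whose pattern matches, else 'other'."""
    for label, pattern in CONVENTIONS:
        if pattern.match(name):
            return label
    return "other"


def popularity(source):
    """Count how many extracted identifiers fall under each convention."""
    return Counter(classify(name) for name in extract_identifiers(source))


SAMPLE = '''
/* demo */
int nCount = 0;            // Hungarian-style prefix
char *user_name = "aB";
class HttpServer { void parseRequest(); };
'''

print(popularity(SAMPLE))
```

Checking the Hungarian pattern before the camel pattern is a design choice of this sketch: names such as `nCount` satisfy both, so the order decides which convention receives the count.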
