On Tracking Java Methods with Git Mechanisms
This work addresses a specific problem in mining software repositories that is relevant to both researchers and developers. The contribution is incremental: it builds on prior techniques to improve the accuracy of method tracking.
The paper tackles the problem of accurately tracking Java methods at a fine-grained level using Git mechanisms, particularly when methods are renamed or moved. It shows that the proposed tool, FinerGit, has a higher tracking capability than the existing Historage technique, based on its application to 182 open source projects containing 1,768K methods in total.
Method-level historical information is useful in research on mining software repositories, such as fault-prone module detection or evolutionary coupling identification. An existing technique named Historage converts a Git repository of a Java project to a finer-grained one. In a finer-grained repository, each Java method exists as a single file. Treating Java methods as files has the advantage that methods can be tracked with Git mechanisms. The biggest benefit of tracking methods with Git mechanisms is that the results can easily be combined with any other tools and techniques built on Git infrastructure. However, Historage's tracking has accuracy issues, especially for small methods. More concretely, when a small method is renamed or moved to another class, Historage has a limited capability to track the method. In this paper, we propose a new technique, FinerGit, to improve the trackability of Java methods with Git mechanisms. We implement FinerGit as a system and apply it to 182 open source software projects, which include 1,768K methods in total. The experimental results show that our tool has a higher capability of tracking methods when methods are renamed or moved to other classes.
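To make the small-method problem concrete, the following is a minimal sketch, not the implementation of Historage or FinerGit. When each method is stored as its own file, tracking a renamed or moved method depends on Git's rename detection, which compares file contents against a similarity threshold (50% by default for `git diff -M`). The class `MethodSimilaritySketch`, its methods, and the line-based similarity measure are all hypothetical simplifications introduced for illustration; Git's actual similarity index is computed differently.

```java
import java.util.ArrayList;
import java.util.List;

class MethodSimilaritySketch {

    // Hypothetical, simplified measure (NOT Git's actual algorithm):
    // the number of lines shared by both versions, divided by the
    // line count of the larger version.
    static double lineSimilarity(String before, String after) {
        List<String> a = nonBlankLines(before);
        List<String> b = nonBlankLines(after);
        int larger = Math.max(a.size(), b.size());
        if (larger == 0) {
            return 1.0; // two empty files are treated as identical
        }
        // Multiset matching: each line of 'after' can be matched once.
        List<String> remaining = new ArrayList<>(b);
        int shared = 0;
        for (String line : a) {
            if (remaining.remove(line)) {
                shared++;
            }
        }
        return (double) shared / larger;
    }

    // Splits text into trimmed, non-blank lines.
    static List<String> nonBlankLines(String text) {
        List<String> lines = new ArrayList<>();
        for (String raw : text.split("\n")) {
            String trimmed = raw.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    // A method file counts as "tracked" across a rename if the two
    // versions reach the given similarity threshold.
    static boolean trackedAsRename(String before, String after, double threshold) {
        return lineSimilarity(before, after) >= threshold;
    }

    public static void main(String[] args) {
        // Small method: renamed, with a one-line body that also changed.
        String smallBefore = String.join("\n", "int getSize() {", "return size;", "}");
        String smallAfter = String.join("\n", "int size() {", "return this.size;", "}");

        // Longer method: renamed, body unchanged.
        String body = String.join("\n",
            "int sum = 0;", "for (int x : xs) {", "if (x > 0) {",
            "sum += x;", "}", "}", "return sum;", "}");
        String longBefore = "int total(int[] xs) {\n" + body;
        String longAfter = "int sumPositive(int[] xs) {\n" + body;

        System.out.println("small tracked: " + trackedAsRename(smallBefore, smallAfter, 0.5));
        System.out.println("long tracked:  " + trackedAsRename(longBefore, longAfter, 0.5));
    }
}
```

Under this simplified measure, the small method shares only one of its three lines with its renamed version and falls below the threshold, while the longer method shares eight of nine lines and stays above it. The sketch only illustrates why a file holding a short method is more fragile under threshold-based rename detection; it makes no claim about how FinerGit addresses the problem.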