Structured Information Retrieval Strategies for Localising Software Changes
This work addresses the problem of inefficient software maintenance by giving developers a more comprehensive change localisation tool. It appears incremental, since it builds on existing IR techniques, although it applies them in a novel way to predicting where new files will be created.
The authors tackle the problem of automatically localising software changes for arbitrary change requests by proposing a generalised approach that identifies both the files to modify and the packages where new files should be created. Their SeekChanges method, which applies structured information retrieval strategies to Java projects, achieved satisfactory overall performance and good performance specifically in recommending newly created files.
During software maintenance and evolution, developers must handle a large number of change requests by modifying existing code or adding new code to the system. Handling a change request efficiently calls for accurate localisation of software changes, i.e. identifying which code is problematic and where new files should be added for any type of change request at hand, such as a bug report or a feature request. Existing automatic techniques for this change localisation problem are limited in two respects: on the one hand, they can tackle only a specific type of change request; on the other hand, they focus on finding files that should be modified for a change request, and are barely capable of recommending which files or packages might be newly created. To address these limitations, we propose a generalised change localisation approach that identifies the to-be-modified files (mostly for bugs) and, at the same time, points out where new files or packages should be created (mostly for new feature requests) for an arbitrary type of change request. To tackle the key challenge of predicting to-be-created program elements, our proposed SeekChanges approach leverages the hierarchical package structure of Java projects and models the change localisation problem as a structured information retrieval (IR) task. We systematically investigate three structured IR strategies for scoring and ranking both the files that should be modified and the software packages in which new files should be created to address change requests. Extensive experiments on four open source Java projects from the Apache Software Foundation demonstrate that structured IR strategies perform well at recommending newly created files, while the overall performance of localising change requests is also satisfactory.
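The idea of scoring both existing files and packages against a change request can be illustrated with a small sketch. This is not the SeekChanges implementation: the abstract does not describe the three structured IR strategies, so the TF-IDF cosine scoring, the rule that a package's score is the mean score of the files directly inside it, and the names `tokenize`, `tfidf_vectors`, `cosine` and `rank_changes` are all assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def tfidf_vectors(docs):
    """Build smoothed TF-IDF vectors. docs maps a name to a token list."""
    n = len(docs)
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    idf = {t: math.log((n + 1) / (c + 1)) + 1.0 for t, c in df.items()}
    vectors = {
        name: {t: tf * idf[t] for t, tf in Counter(tokens).items()}
        for name, tokens in docs.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_changes(files, request):
    """Rank files (candidates to modify) and packages (candidate homes for
    new files). files maps a '/'-separated source path to its text."""
    docs = {path: tokenize(text) for path, text in files.items()}
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the corpus get weight zero.
    query = {t: tf * idf.get(t, 0.0)
             for t, tf in Counter(tokenize(request)).items()}

    file_scores = {path: cosine(query, vec) for path, vec in vectors.items()}

    # Assumed aggregation: a package's score is the mean of the scores of
    # the files that sit directly inside it.
    by_package = defaultdict(list)
    for path, score in file_scores.items():
        package = path.rsplit("/", 1)[0] if "/" in path else ""
        by_package[package].append(score)
    package_scores = {p: sum(s) / len(s) for p, s in by_package.items()}

    def ranked(scores):
        # Highest score first; ties broken alphabetically for determinism.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))

    return ranked(file_scores), ranked(package_scores)
```

Calling `rank_changes` with a mapping of source paths to file text and a change request string returns two ranked lists: one of files and one of packages. In this sketch a bug report would mainly be read against the file ranking, while a feature request would be read against the package ranking to suggest where a new file might belong.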