Towards Automatic Generation of Short Summaries of Commits
This work addresses the challenge of improving commit message quality for software developers. It is incremental: it covers only verb generation and seeks community feedback before completing the approach.
The paper tackles the problem of automatically generating short commit summaries by proposing a "verb+object" format. The format is motivated by two observations: 82% of human-written commit messages consist of a single sentence, and commit messages often begin with a verb followed by a direct object. As an initial step, the paper presents a classifier for verb generation.
Committing to a version control system means submitting a software change to the system. Each commit can have a message that describes the submission. Several approaches have been proposed to automatically generate the content of such messages. However, the quality of the automatically generated messages falls far short of what humans write. In studying the differences between auto-generated and human-written messages, we found that 82% of the human-written messages have only one sentence, while the automatically generated messages often have multiple lines. Furthermore, we found that commit messages often begin with a verb followed by a direct object. This finding inspired us to use a "verb+object" format in this paper to generate short commit summaries. We split the approach into two parts: verb generation and object generation. As a first step, we trained a classifier that maps a diff to a verb. We are seeking feedback from the community before we continue to work on generating direct objects for the commits.
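
To make the verb-generation step concrete, the following is a minimal sketch of a classifier that maps a diff to a verb. It is not the classifier used in this work: the features (signed word tokens from added and removed lines), the model (multinomial Naive Bayes with add-one smoothing), the names `diff_tokens` and `VerbClassifier`, and the four-example training set are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict

TOKEN_RE = re.compile(r"[A-Za-z_]+")


def diff_tokens(diff):
    """Return lowercased word tokens from changed lines, prefixed with + or -."""
    tokens = []
    for line in diff.splitlines():
        # Skip file headers ("+++ b/x", "--- a/x") and unchanged context lines.
        if line.startswith(("+++", "---")) or not line.startswith(("+", "-")):
            continue
        sign = line[0]
        tokens += [sign + tok.lower() for tok in TOKEN_RE.findall(line[1:])]
    return tokens


class VerbClassifier:
    """Hypothetical diff-to-verb model: multinomial Naive Bayes, add-one smoothing."""

    def __init__(self):
        self.verb_counts = Counter()              # number of training diffs per verb
        self.token_counts = defaultdict(Counter)  # token frequencies per verb
        self.vocab = set()

    def fit(self, diffs, verbs):
        for diff, verb in zip(diffs, verbs):
            self.verb_counts[verb] += 1
            for tok in diff_tokens(diff):
                self.token_counts[verb][tok] += 1
                self.vocab.add(tok)
        return self

    def predict(self, diff):
        tokens = diff_tokens(diff)
        total = sum(self.verb_counts.values())

        def score(verb):
            counts = self.token_counts[verb]
            denom = sum(counts.values()) + len(self.vocab)
            log_prob = math.log(self.verb_counts[verb] / total)
            for tok in tokens:
                if tok in self.vocab:  # tokens unseen in training are ignored
                    log_prob += math.log((counts[tok] + 1) / denom)
            return log_prob

        return max(self.verb_counts, key=score)


# Toy training data, invented for this sketch only.
train_diffs = [
    "--- a/util.py\n+++ b/util.py\n+def parse_date(text):\n+    return text.strip()\n",
    "--- a/io.py\n+++ b/io.py\n+def read_config(path):\n+    return open(path).read()\n",
    "--- a/old.py\n+++ b/old.py\n-def legacy_helper():\n-    pass\n",
    "--- a/tmp.py\n+++ b/tmp.py\n-def unused_parser(text):\n-    return None\n",
]
train_verbs = ["add", "add", "remove", "remove"]

model = VerbClassifier().fit(train_diffs, train_verbs)
predicted = model.predict("--- a/x.py\n+++ b/x.py\n+def load(path):\n+    return path\n")
print(predicted)
```

Tagging each token with the sign of its line is one simple way to let the model distinguish added from removed code, which is presumably what separates verbs such as "add" and "remove". A realistic setup would need a much larger labeled corpus and a richer verb vocabulary.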