SE, Jul 23, 2014

Mining developer communication data streams

arXiv:1407.6104v1
2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses build outcome prediction in software development. Its contribution is incremental: it applies existing data stream mining techniques to a specific dataset.

The paper tackles build outcome prediction by analyzing developer communication data streams from the Jazz repository, and finds that only a small number of the metrics considered are significant for prediction.

This paper explores the concept of modelling a software development project as a process that results in the creation of a continuous stream of data. In the Jazz repository used in this research, one aspect of that stream is developer communication. Such data can be used to create an evolving social network characterized by a range of metrics. This paper presents the application of data stream mining techniques to identify the most useful metrics for predicting build outcomes. Results are presented from applying the Hoeffding Tree classification method in conjunction with the Adaptive Sliding Window (ADWIN) method for detecting concept drift. The results indicate that only a small number of the available metrics have any significance for predicting the outcome of a build.
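The abstract names the Hoeffding Tree method without showing how it decides when to split on a metric. As a rough illustration only, and not the authors' implementation, the sketch below computes the standard Hoeffding bound and applies the usual split test: split once the gap between the two best candidate attributes exceeds the bound. The function names, the default `delta`, and the example gain values are assumptions made for this sketch. ADWIN drift detection is not covered here.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, with probability 1 - delta, the true mean of a
    variable with the given range lies within epsilon of its mean over n samples."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, n: int,
                 n_classes: int = 2, delta: float = 1e-7) -> bool:
    """Hypothetical helper: decide whether a leaf has seen enough examples
    to split on its best attribute.

    Information gain lies in [0, log2(n_classes)], so that is used as the range.
    """
    value_range = math.log2(n_classes)
    epsilon = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > epsilon


# Illustrative (made-up) gains for two communication metrics at one leaf.
few_examples = should_split(best_gain=0.30, second_gain=0.25, n=200)
many_examples = should_split(best_gain=0.30, second_gain=0.25, n=20000)
```

With the same observed gap between the two best metrics, the leaf does not split after 200 builds but does after 20000, because the bound shrinks as more examples arrive. This is one way a stream learner can end up relying on only a few metrics: attributes whose gain never separates clearly from the rest are never chosen for a split.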

Code Implementations: 1 repo