SE · Apr 16, 2017

Automatic Bug Triage using Semi-Supervised Text Classification

arXiv:1704.04769v1 · 153 citations
Originality: Incremental advance
AI Analysis

This work addresses the issue of insufficient labeled data for bug triage in software engineering, offering an incremental improvement over prior supervised approaches.

The paper proposes a semi-supervised text classification approach to bug triage that combines a naive Bayes classifier with expectation-maximization (EM) to use both labeled and unlabeled bug reports. On Eclipse bug reports, it reports higher classification accuracy than existing supervised methods.

In this paper, we propose a semi-supervised text classification approach for bug triage to address the shortage of labeled bug reports that limits existing supervised approaches. The new approach combines a naive Bayes classifier with expectation-maximization to take advantage of both labeled and unlabeled bug reports. It first trains a classifier on a small fraction of labeled bug reports. It then iteratively labels the many unlabeled bug reports and trains a new classifier on the labels of all the bug reports. We also employ a weighted recommendation list to boost performance by imposing the weights of multiple developers when training the classifier. Experimental results on Eclipse bug reports show that our new approach outperforms existing supervised approaches in terms of classification accuracy.
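The train, label, retrain loop described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a multinomial naive Bayes model with Laplace smoothing, probabilistic (soft) labels in the E-step, and a fixed number of iterations. The function names, the toy bug reports, and the developers `alice` and `bob` are all hypothetical. The weighted recommendation list is left out because the abstract does not say how the weights are computed.

```python
import math
from collections import Counter

def train_nb(docs, soft_labels, classes, vocab, alpha=1.0):
    """M-step: fit multinomial naive Bayes from (possibly fractional) labels.
    docs: list of token lists; soft_labels: list of {developer: weight} dicts."""
    class_weight = {c: 0.0 for c in classes}
    word_weight = {c: Counter() for c in classes}
    for tokens, label in zip(docs, soft_labels):
        for c, w in label.items():
            class_weight[c] += w
            for t in tokens:
                word_weight[c][t] += w
    total = sum(class_weight.values())
    log_prior = {c: math.log((class_weight[c] + alpha) / (total + alpha * len(classes)))
                 for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(word_weight[c].values()) + alpha * len(vocab)
        log_like[c] = {t: math.log((word_weight[c][t] + alpha) / denom) for t in vocab}
    return log_prior, log_like

def posterior(tokens, model, classes):
    """E-step for one report: P(developer | tokens), computed in log space."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
              for c in classes}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def em_naive_bayes(labeled, unlabeled, iterations=10):
    """labeled: list of (tokens, developer); unlabeled: list of token lists."""
    classes = sorted({dev for _, dev in labeled})
    labeled_docs = [tokens for tokens, _ in labeled]
    vocab = {t for doc in labeled_docs + unlabeled for t in doc}
    hard = [{dev: 1.0} for _, dev in labeled]
    # Initial classifier uses only the labeled fraction.
    model = train_nb(labeled_docs, hard, classes, vocab)
    for _ in range(iterations):  # assumption: fixed iteration count, no convergence test
        soft = [posterior(doc, model, classes) for doc in unlabeled]       # E-step
        model = train_nb(labeled_docs + unlabeled, hard + soft, classes, vocab)  # M-step
    return model, classes

def recommend(tokens, model, classes):
    """Developers ranked by posterior probability, most likely first."""
    post = posterior(tokens, model, classes)
    return sorted(classes, key=lambda c: post[c], reverse=True)

# Toy data (hypothetical): two labeled reports, four unlabeled ones.
labeled = [("ui button crash".split(), "alice"),
           ("compiler parse error".split(), "bob")]
unlabeled = ["button layout ui widget".split(),
             "parse syntax compiler token".split(),
             "widget layout render".split(),
             "token syntax lexer".split()]

model, classes = em_naive_bayes(labeled, unlabeled)
ranking = recommend("widget render".split(), model, classes)
```

In this toy run, the words `widget` and `render` never appear in a labeled report, so a classifier trained on the labeled data alone has no basis for ranking the developers. The EM iterations let those words inherit evidence from unlabeled reports that share vocabulary with the labeled ones.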

