Predicting Issue Types with seBERT
This work addresses the specific problem of issue type prediction for software engineering practitioners, but it is incremental as it applies an existing method (fine-tuning a pre-trained model) to a new dataset.
The authors tackled the problem of predicting issue types in software engineering by fine-tuning seBERT, a BERT-based model pre-trained on software engineering data, achieving an overall F1-score of 85.7%, which is a 4.1% improvement over the baseline fastText model.
Pre-trained transformer models are the current state of the art for natural language processing. seBERT is such a model: it is based on the BERT architecture but was trained from scratch on software engineering data. We fine-tuned this model for the NLBSE challenge on the task of issue type prediction. Our model dominates the fastText baseline for all three issue types in both recall and precision, achieving an overall F1-score of 85.7%, which is an increase of 4.1% over the baseline.
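The evaluation described above compares per-type precision and recall and then reports one overall F1-score. A minimal sketch of how such numbers can be computed from paired gold and predicted labels is given below. It is an illustration under stated assumptions, not the authors' evaluation code: the label names and the toy data are hypothetical, and the text does not say how the overall score is aggregated, so micro-averaging is assumed here.

```python
from collections import Counter


def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # gold label gains a false negative
    scores = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def micro_f1(y_true, y_pred):
    """Micro-averaged F1; with exactly one label per item this equals accuracy."""
    if not y_true:
        return 0.0
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def dominates(model_scores, baseline_scores):
    """True if the model is strictly better in precision and recall for every label."""
    return all(
        model_scores[label][0] > baseline_scores[label][0]
        and model_scores[label][1] > baseline_scores[label][1]
        for label in baseline_scores
    )


# Hypothetical label names and toy data, for illustration only.
LABELS = ["bug", "enhancement", "question"]
y_true = ["bug", "bug", "enhancement", "question", "enhancement", "bug"]
y_pred = ["bug", "enhancement", "enhancement", "question", "enhancement", "bug"]

scores = per_class_scores(y_true, y_pred, LABELS)
overall = micro_f1(y_true, y_pred)
```

Calling `dominates(model_scores, baseline_scores)` on two such score dictionaries checks the claim that one classifier beats the other in both precision and recall for every issue type. A macro-averaged overall score (the unweighted mean of the per-type F1 values) would be an equally plausible reading of "overall F1-score" and can give a different number when the issue types are imbalanced.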