Abstract
Bugs are common in most software systems. In large-scale software projects, developers usually carry out maintenance tasks with the help of software artifacts such as bug reports. The severity of a bug report describes the impact of the bug and determines how quickly it needs to be fixed. Bug triagers pay close attention to features such as severity to determine the importance of bug reports and assign them to the correct developers. However, the large number of bug reports submitted every day increases the workload of developers, who have to spend more time fixing bugs. In this paper, we collect question-and-answer pairs from Stack Overflow and use logistic regression to predict the severity of bug reports. In detail, we extract all the posts related to bug repositories from Stack Overflow and combine them with bug reports to obtain enhanced versions of the bug reports. We evaluate severity prediction on three popular open source projects (Mozilla, Eclipse, and GCC) and compare against Naïve Bayes, k-Nearest Neighbor (KNN), and Long Short-Term Memory (LSTM) baselines. The results of our experiments show that our model predicts severity more accurately than previous studies. Compared with the Naïve Bayes based approach, which performs best among the baseline approaches, our approach improves the average F-measure by 23.03%, 21.86%, and 20.59% for Mozilla, Eclipse, and GCC, respectively.
Original language | English |
---|---|
Article number | 110567 |
Pages (from-to) | 1-14 |
Journal | Journal of Systems and Software |
Volume | 165 |
DOIs | |
Publication status | Published - Jul 2020 |
Keywords
- Bug reports
- Logistic regression
- Severity prediction
- Stack Overflow
ASJC Scopus subject areas
- Software
- Information Systems
- Hardware and Architecture