Software defect prediction is an active research area in software engineering. Accurate defect prediction helps software engineers direct quality assurance activities so as to make the best use of testing resources, reduce maintenance cost, and deliver quality software products. In machine learning research, ensemble learning has been shown to improve prediction performance over individual models. Many boosting ensembles have recently been proposed in the literature, but their prediction capabilities have not been investigated in defect prediction. In this paper, we empirically investigate the prediction performance of seven tree-based ensembles in defect prediction: AdaBoost, Random Forest, Extra Trees, Gradient Boosting, Hist Gradient Boosting, XGBoost, and CatBoost. The study uses 11 publicly available NASA MDP software defect datasets. Empirical results indicate that the Random Forest and Extra Trees ensembles outperform the boosting ensembles. However, none of the boosting ensembles performed significantly worse than individual decision trees. Finally, AdaBoost was the worst-performing ensemble.
Thu 5 Nov (times in UTC, Coordinated Universal Time)
|16:00 - 16:20|Software Defect Prediction using Tree-Based Ensembles|
|16:20 - 16:40|Improving Real-World Vulnerability Characterization with Vulnerable Slices|