SEERA: A software cost estimation dataset for constrained environments
The accuracy of software cost estimation depends on the relevancy of the cost estimation dataset, the quality of its data and its suitability for the targeted software development environment. Software development cost is impacted by technical, socio-economic and country-specific organizational and cultural environments. Current publicly available software cost estimation datasets represent environments of North America and Europe, thus limiting their application in technically and economically constrained software industries. In this paper we introduce the SEERA (Software enginEERing in SudAn) cost estimation dataset, a dataset of 120 software development projects representing 42 organizations in Sudan. The SEERA dataset contains 76 attributes and, unlike current cost estimation datasets, is augmented with metadata and the original raw data. This paper describes the data collection process, submitting organizations and project characteristics. In addition, we give a general analysis of the dataset projects to illustrate the impact of local factors on software project cost and compare the data quality of the SEERA dataset to public datasets from the PROMISE repository. The SEERA dataset fills a gap in the diversity of current cost estimation datasets and provides researchers with an opportunity to evaluate the generalization of previous and future cost estimation methods to constrained environments and to develop new techniques that are more suitable for these environments.
Fri 6 Nov (displayed time zone: UTC, Coordinated Universal Time)
17:05 - 17:45
17:05 | 20m | Talk | SEERA: A software cost estimation dataset for constrained environments | PROMISE 2020
17:25 | 20m | Talk | An Exploratory Study on Applicability of Cross Project Defect Prediction Approaches to Cross-Company Effort Estimation | PROMISE 2020 | Sousuke Amasaki (Okayama Prefectural University), Hirohisa Aman (Ehime University), Tomoyuki Yokogawa (Okayama Prefectural University)