Journal Article

Authors’ Reply to “Comments on ‘Researcher Bias: The Use of Machine Learning in Software Defect Prediction’”
Document Type
Periodical
Source
IEEE Transactions on Software Engineering, 44(11):1129-1131, Nov. 2018
Subject
Computing and Processing
Software
NASA
Measurement
Analysis of variance
Data models
Predictive models
Analytical models
Software quality assurance
defect prediction
researcher bias
Language
English
ISSN
0098-5589
1939-3520
2326-3881
Abstract
In 2014 we published a meta-analysis of software defect prediction studies [1]. This suggested that the most important factor in determining results was Research Group, i.e., who conducts the experiment is more important than the classifier algorithms being investigated. A recent re-analysis [2] sought to argue that the effect is less strong than originally claimed, since there is a relationship between Research Group and Dataset. In this response we show that (i) the re-analysis is based on a small (21 percent) subset of our original data, (ii) applying the same re-analysis approach to a larger subset shows that Research Group is more important than type of Classifier, and (iii) however the data are analysed, there is compelling evidence that who conducts the research has an effect on the results. This means that the problem of researcher bias remains. Addressing it should be seen as a matter of priority amongst those of us who conduct and publish experiments comparing the performance of competing software defect prediction systems.
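The abstract turns on comparing how much variance in reported performance is attributable to Research Group versus Classifier. The sketch below (not the authors' original analysis) illustrates one way such a factor comparison could be run, using a type-II ANOVA over a fixed-effects model. All data are synthetic, and the column names (research_group, classifier, dataset, mcc) are illustrative assumptions; the deliberately larger injected group effect mimics the pattern the paper reports.

```python
# Hypothetical sketch: compare variance explained by Research Group vs.
# Classifier via ANOVA. Synthetic data only; not the paper's dataset.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "research_group": rng.choice(list("ABCDEFGH"), size=n),
    "classifier": rng.choice(["NB", "C4.5", "SVM", "RF"], size=n),
    "dataset": rng.choice(["CM1", "KC1", "PC1", "JM1"], size=n),
})

# Inject a stronger Research Group effect than Classifier effect, so the
# synthetic response reproduces the claimed pattern (assumption, not data
# from the paper).
group_effect = dict(zip("ABCDEFGH", rng.normal(0, 0.10, 8)))
clf_effect = dict(zip(["NB", "C4.5", "SVM", "RF"], rng.normal(0, 0.03, 4)))
df["mcc"] = (0.4
             + df["research_group"].map(group_effect)
             + df["classifier"].map(clf_effect)
             + rng.normal(0, 0.05, n))

# Fit a fixed-effects linear model and compare each factor's sum of
# squares; the factor with the larger share explains more of the results.
model = smf.ols("mcc ~ C(research_group) + C(classifier) + C(dataset)",
                data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```

On the synthetic data above, the research_group row dominates the sum-of-squares column, which is the kind of evidence pattern the abstract describes; on real meta-analysis data the confounding between Research Group and Dataset would of course need the more careful treatment the reply discusses.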