Academic Paper

A statistics-based performance testing methodology: a case study for the I/O bound tasks
Document Type
Conference
Source
2022 IEEE 17th International Conference on Computer Sciences and Information Technologies (CSIT), pp. 486-489, Nov. 2022
Subject
Communication, Networking and Broadcast Technologies
Computing and Processing
Signal Processing and Analysis
Data analysis
Statistical analysis
Programming
Time measurement
Reliability
Noise measurement
Task analysis
Web crawlers
parallel and concurrent programming
benchmarking
multithreading
epoll
Wilcoxon signed-rank test
Student's t-test
Bonferroni correction
Language
ISSN
2766-3639
Abstract
This research demonstrates that the established statistical methodology developed for experimental data analysis in physics, biology, and other natural sciences is applicable to the analysis of computing performance measurements. The methodology is based on the randomization of test samples and test conditions and on statistical hypothesis testing to determine whether performance differs. The hypothesis test is selected according to the statistical distribution of the data: Student's t-test for normally distributed data, and non-parametric tests, such as the Wilcoxon signed-rank test, for other distributions. The multiple comparisons problem is addressed with the Bonferroni correction. Other important natural science methods, such as various kinds of blinded experiments and analyses, could also be beneficial, but they may be too expensive in such contexts without additional justification. The method is applied to a model problem: a comparison of the performance of three implementations of web crawlers, compiled with different options and different compilers. Experimental cases were intentionally selected to include both large and minor (but definite) performance differences, to show how statistical methods can help determine whether a performance difference exists and support a confident discussion of its extent when weighing the benefits and drawbacks of the various setups.
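
The workflow summarized in the abstract (randomized run order, a paired hypothesis test, Bonferroni correction) can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the authors' implementation. The function names, the build labels, and the synthetic timings are all assumptions made for the example. Only the non-parametric branch is shown; the normality check and the Student's t-test branch are omitted. The Wilcoxon p-value uses the normal approximation, which is itself an assumption that is reasonable only for moderately large samples.

```python
import math
import random
from itertools import combinations


def randomized_schedule(configs, repeats, seed=None):
    """Return every (config, repeat) run in shuffled order."""
    rng = random.Random(seed)
    runs = [(c, r) for c in configs for r in range(repeats)]
    rng.shuffle(runs)
    return runs


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test (normal approximation).

    Returns (w_plus, p_value). Zero differences are dropped and tied
    absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of N(0, 1)
    return w_plus, p_value


def bonferroni(p_values, alpha=0.05):
    """Flag each comparison as significant at the corrected level alpha / m."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]


if __name__ == "__main__":
    rng = random.Random(0)
    builds = ["build_a", "build_b", "build_c"]  # hypothetical build labels
    print(randomized_schedule(builds, repeats=2, seed=0))

    # Synthetic paired timings in seconds, generated only for illustration.
    base = [rng.gauss(10.0, 0.5) for _ in range(20)]
    timings = {
        "build_a": base,
        "build_b": [t + rng.gauss(0.3, 0.2) for t in base],
        "build_c": [t + rng.gauss(0.0, 0.2) for t in base],
    }
    pairs = list(combinations(builds, 2))
    p_values = [wilcoxon_signed_rank(timings[a], timings[b])[1] for a, b in pairs]
    for pair, p, significant in zip(pairs, p_values, bonferroni(p_values)):
        print(pair, round(p, 4), significant)
```

With three builds there are three pairwise comparisons, so each p-value is compared against 0.05 / 3 rather than 0.05. Shuffling the run order is meant to spread slow drifts in system state (caches, background load, network conditions) evenly across the builds rather than letting them bias one of them.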