Academic Journal Article

SPARC: Statistical Performance Analysis With Relevance Conclusions
Document Type
Periodical
Source
IEEE Open Journal of the Computer Society, vol. 2, pp. 117-129, 2021
Subject
Computing and Processing
Benchmark testing
Performance evaluation
Computer performance
Statistical analysis
Program processors
Social factors
Performance benchmarking
RISC-V
relevance testing
statistical analysis
Language
English
ISSN
2644-1268
Abstract
The performance of one computer relative to another is traditionally characterized through benchmarking, a practice occasionally deficient in statistical rigor. Performance is often reduced to simplified measures, such as measures of central tendency, but doing so risks obscuring the variability and non-determinism of modern computer systems. Authentic performance evaluations are derived from statistical methods that accurately interpret and assess data. The methods that currently exist within performance comparison frameworks are limited in efficacy: statistical inference is either overly simplified or avoided altogether. A prevalent criticism in the computer performance literature is that the results of difference hypothesis testing lack substance. To address this problem, we propose a new framework, SPARC, that pioneers a synthesis of difference and equivalence hypothesis testing to provide relevant conclusions. It is a union of three key components: (i) identification of either superiority or similarity through difference and equivalence hypotheses, (ii) a methodology that scales with the number of benchmarks, and (iii) a conditional feedback loop driven by test outcomes that yields one of four informative conclusions: relevant, equivalent, trivial, or indeterminate. We present an experimental analysis characterizing the performance of three open-source RISC-V processors to evaluate SPARC and its efficacy compared to similar frameworks.
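For illustration, the synthesis the abstract describes can be sketched with off-the-shelf statistics: a difference test (here Welch's t-test) paired with a TOST-style equivalence test, feeding a decision rule that emits one of the four conclusion labels. This is a minimal sketch under stated assumptions, not the paper's implementation; the function name sparc_style_verdict, the equivalence margin, the outcome mapping, and the synthetic benchmark data are all hypothetical.

import numpy as np
from scipy import stats

def sparc_style_verdict(a, b, margin, alpha=0.05):
    """Classify two benchmark samples as 'relevant', 'equivalent',
    'trivial', or 'indeterminate' (illustrative, not the SPARC code)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)

    # Difference hypothesis (Welch's t-test): H0 says the means are equal.
    _, p_diff = stats.ttest_ind(a, b, equal_var=False)
    different = p_diff < alpha

    # Equivalence hypothesis via two one-sided tests (TOST):
    # H0 says the mean difference lies outside [-margin, +margin].
    _, p_lower = stats.ttest_ind(a + margin, b, equal_var=False,
                                 alternative='greater')
    _, p_upper = stats.ttest_ind(a - margin, b, equal_var=False,
                                 alternative='less')
    equivalent = max(p_lower, p_upper) < alpha

    # One plausible mapping of the four conclusion labels; the paper's
    # exact decision logic may differ.
    if different and not equivalent:
        return 'relevant'       # a meaningful performance difference
    if equivalent and not different:
        return 'equivalent'     # no detectable or meaningful difference
    if different and equivalent:
        return 'trivial'        # statistically different, practically not
    return 'indeterminate'      # evidence supports neither conclusion

# Usage with synthetic scores from two hypothetical processors:
rng = np.random.default_rng(0)
core_a = rng.normal(100.0, 5.0, size=30)
core_b = rng.normal(101.0, 5.0, size=30)
print(sparc_style_verdict(core_a, core_b, margin=3.0))

The margin encodes what counts as a practically irrelevant difference; the "trivial" branch captures the abstract's criticism of bare difference testing, where a statistically significant result can still fall entirely within that margin.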