학술논문
Analysis of branch misses in Quicksort.
Document Type
Proceedings Paper
Author
Martínez, Conrado (E-UPB-C) AMS Author Profile; Nebel, Markus E. (D-RPTU-CS) AMS Author Profile; Wild, Sebastian (D-RPTU-CS) AMS Author Profile
Source
Subject
68 Computer science -- 68Q Theory of computing
68Q87Probability in computer science
68Q87
Language
English
Abstract
In this paper an investigation is made of the impact of pivot choiceand modern processor architecture in the well-known sorting algorithmQuicksort. Quicksort is one of the most intensively used sortingalgorithms; it is also used as a default sorting algorithm andimplemented in many programming language libraries. The Quicksortalgorithm is based on the ``divide and conquer'' technique, using oneelement of the input as a pivot according to which the input ispartitioned into two smaller subarrays, which are then sortedrecursively by the same procedure. The idea of using more than onepivot is shown to be less efficient, e.g., with two pivots we save$5\%$ of comparisons, but at the same time increase the number of swapsup to $80\%$. From a practical point of view, the dual-pivot Quicksortalgorithm experiments show $10\%$ time improvement (Yaroslavskiy'sdual-pivot Quicksort and their implementation in Oracle's Java 7library). This result, which is still not entirely clear, is supposedto be a consequence of the architecture of modern processors. In thispaper, the authors give an analytical investigation of the influence ofpipelining and the resulting branch mispredictions on the efficiency of(classic) Quicksort and Yaroslavskiy's dual-pivot Quicksort. Theyaddress the effect of pipelines by using different branch predictionstrategies, and analyze the expected number of branch misses. They giveprecise asymptotics for the expected number of branch misses caused bythe aforementioned Quicksort variants when their pivots are chosen froma sample of the input. Their conclusion is that the difference inbranch misses is too small to explain the time superiority of thedual-pivot algorithm.