학술논문

Analysis of branch misses in Quicksort.
Document Type
Proceedings Paper
Author
Martínez, Conrado (E-UPB-C) AMS Author Profile; Nebel, Markus E. (D-RPTU-CS) AMS Author Profile; Wild, Sebastian (D-RPTU-CS) AMS Author Profile
Source
2015 Proceedings of the Twelfth Workshop on Analytic Algorithmics and Combinatorics (ANALCO) (20150101), 114-128.
Subject
68 Computer science -- 68Q Theory of computing
  68Q87 Probability in computer science
Language
English
Abstract
In this paper an investigation is made of the impact of pivot choice and modern processor architecture in the well-known sorting algorithm Quicksort. Quicksort is one of the most intensively used sorting algorithms; it is also used as a default sorting algorithm and implemented in many programming language libraries. The Quicksort algorithm is based on the ``divide and conquer'' technique, using one element of the input as a pivot according to which the input is partitioned into two smaller subarrays, which are then sorted recursively by the same procedure. The idea of using more than one pivot is shown to be less efficient, e.g., with two pivots we save $5\%$ of comparisons, but at the same time increase the number of swaps up to $80\%$. From a practical point of view, the dual-pivot Quicksort algorithm experiments show $10\%$ time improvement (Yaroslavskiy's dual-pivot Quicksort and their implementation in Oracle's Java 7 library). This result, which is still not entirely clear, is supposed to be a consequence of the architecture of modern processors. In this paper, the authors give an analytical investigation of the influence of pipelining and the resulting branch mispredictions on the efficiency of (classic) Quicksort and Yaroslavskiy's dual-pivot Quicksort. They address the effect of pipelines by using different branch prediction strategies, and analyze the expected number of branch misses. They give precise asymptotics for the expected number of branch misses caused by the aforementioned Quicksort variants when their pivots are chosen from a sample of the input. Their conclusion is that the difference in branch misses is too small to explain the time superiority of the dual-pivot algorithm.

Online Access