Quicksort sorts an array by first choosing a ``pivot'' element x, and then ``partitioning'' the other elements into those less than x (which end up toward the left of the array) and those greater than x (which end up toward the right), and then x goes in between the two groups. Then we make two recursive calls, on the left subarray and on the right subarray. Once the pivot has been put in between the left and right sections, that is its final correct place. (Of course, each recursive call creates a new pivot for that left or right part.)
Quicksort (
)
1 if
then return
2
Partition (
)
3 Quicksort(
)
4 Quicksort(
)
Here is how Partition works: We partition on the first element. Take it out. Look for the rightmost element that is smaller and place it in the first position (which is the newly opened position on the left side). Look for the leftmost element that is larger and place it in the newly opened position on the right side. Starting from there look for the rightmost element that is smaller and place it in the the newly opened position on the left side. Starting from there look for the left mostelement that is larger and place it in the newly opened position on the right side. Continue in this fashion until the pointers cross. Finally, put the partition element into the hole, which is its final position in the sorted array.
There are many versions of Partition; the one described above (and coded below) is especially nice in that it leads to a relatively easy mathematical analysis for the average-case number of comparisons (using probabilities, calculus, summations, and strong constructive induction!)
Partition (A, start, end)
x := A[start]
left := start
right := end + 1
while left < right
do right := right - 1 until {left >= right or A[right] <= x}
if left < right then do A[left] := A[right] else end_while
do left := left + 1 until {left >= right or A[left] >= x}
if left < right then do A[right] := A[left]
A[left] := x
return left
The worst-case runtime for quicksort is not good:
. And this
occurs, surprisingly, when the data is already sorted! For then the
pivot is the smallest value and Partition does little good except to
split off that one value before the next (recursive) call. Thus
Partition has to run
times, and each time it makes
comparisons (where
runs from
down to 1). The average case more
than makes up for this defect, however, as we shall see below.
Further, there is a technique that can help avoid the worst case, namely randomizing the array before each call to Partition. In fact, all we really need is to choose a pivot value that is likely to be near the median; and for this we could simply exchange A[1] with a randomly-chosen array value. An even better approach is to choose (say) three array values at random and use the second largest (ie the mid-sized one).
Such modified versions of algorithms are called randomized algorithms.
Consider quicksort applied
to
values. Let
be the average number of comparisons the
algorithm uses. We know
. The partition operation
is done with exactly
comparisons. Assume that the partition
value turns out to have sorted rank (order position)
. Then we
have its location, but we still have to recursively quicksort the
smallest values and the
largest values. So, for a fixed
,
the remaining steps of quicksort take
comparisons on
average.
But the value of
is equally likely to be any integer from
to
.
So, averaging over all possible values of
, the expected number of
remaining steps of quicksort takes is
comparisons.
We claim that
for some constant
and
.
Proof by constructive induction.
Base case:
:
and
.
Induction step:
Assume it holds for all positive integers less than
.
Then