I was going through the analysis of quicksort in Sedgewick’s Algorithms book. He sets up the following recurrence relation for the number of compares quicksort uses to sort an array of N distinct items.
I am having a tough time understanding this. I know that each element has probability 1/N of becoming the pivot, and that if the k-th smallest element becomes the pivot, the left sub-array will have k-1 elements and the right sub-array will have N-k elements.
1. How does the cost of partitioning become N+1? Does it take N+1 compares to do the partitioning?
2. Sedgewick says, for each value of k, if you add those up, the probability that the partitioning element is k + the cost for the two sub-arrays, you get the above equation.
- Can someone explain this so that those with less math knowledge (me) can understand?
- Specifically how do you get the second term in the equation?
- What exactly does that term stand for?
The cost function C for quicksort consists of two parts. The first part is the cost of partitioning the array into two ‘halves’ (the halves don’t have to be of equal size, hence the quotes). The second part is the cost of sorting those two halves.
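For reference, the two parts can be written as one recurrence. The form below is reconstructed from the terms discussed in this thread rather than copied from the book, and the base cases C_0 = C_1 = 0 are an assumption, so treat the exact notation as a sketch:

$$
C_N = \underbrace{(N + 1)}_{\text{partitioning}} + \underbrace{\sum_{k=1}^{N} \frac{1}{N}\left(C_{k-1} + C_{N-k}\right)}_{\text{sorting the two halves}}, \qquad C_0 = C_1 = 0
$$

Here k is the rank of the pivot (k = 1 means the pivot is the smallest element), so k-1 and N-k are the sizes of the left and right sub-arrays.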
- The (N + 1) term is a condensed form of (N - 1) + 2. This is the cost of partitioning in quicksort: N - 1 compares with the pivot value, plus 2 additional compares due to boundary conditions in the partitioning.
- The second part of the equation is the cost of sorting the two ‘halves’ on either side of the pivot value k.
  After choosing a pivot value, you are left with two unsorted ‘halves’. The cost of sorting them depends on their sizes and is most easily described as a recursive application of the cost function C. If the pivot is the smallest of the N values, the costs of sorting the two ‘halves’ are C(0) and C(N-1) respectively (the cost of sorting an array with 0 elements and the cost of sorting one with N-1 elements). If the pivot is the fifth-smallest, the costs are C(4) and C(N-5) respectively (the cost of sorting an array with 4 elements and the cost of sorting one with N-5 elements). And similarly for all other pivot values.
  But how much does it cost to sort those two ‘halves’ if you don’t know the pivot value? You take the cost for each possible value of the pivot and multiply it by the chance that that particular value turns up.
  As each pivot value is equally likely, the chance of choosing a particular pivot value is 1/N if you have N elements. To understand this, think about rolling a die. With a fair die, each side is equally likely to end up facing up, so the chance of rolling a 1 is 1/6.
  Combined, this gives the summation term where, for each possible value k of the pivot, the cost (C(k-1) + C(N-k)) is multiplied by the chance (1/N).
- The further derivation from the summation formula in the question to the 2N lnN in the title takes too much math to explain here in detail, but it is based on the observation that the cost of sorting an array of N elements (C(N)) can be expressed in terms of the cost of sorting an array of N-1 elements (C(N-1)) plus a term that is directly proportional to N.
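The last point can be checked numerically. The sketch below evaluates the recurrence exactly, assuming the base cases C(0) = C(1) = 0 and the N+1 partitioning cost discussed above; the function name `expected_compares` is made up for this illustration. Under those assumptions the values satisfy N·C(N) = (N+1)·C(N-1) + 2N for N ≥ 3, which is the "C(N) in terms of C(N-1)" relation the derivation relies on.

```python
import math
from fractions import Fraction


def expected_compares(n_max):
    """Return a list c with c[n] = C(n) for 0 <= n <= n_max, computed exactly."""
    c = [Fraction(0), Fraction(0)]  # assumed base cases: C(0) = C(1) = 0
    prefix = Fraction(0)            # running total C(0) + ... + C(n-1)
    for n in range(2, n_max + 1):
        prefix += c[n - 1]
        # Summing C(k-1) + C(n-k) over k = 1..n counts each of C(0..n-1) twice,
        # so the summation term equals 2 * prefix / n.
        c.append((n + 1) + 2 * prefix / n)
    return c


if __name__ == "__main__":
    c = expected_compares(1000)
    for n in (10, 100, 1000):
        # Exact recurrence value next to the 2 N ln N approximation.
        print(n, float(c[n]), 2 * n * math.log(n))
```

The exact values stay somewhat below 2N lnN for moderate N because the approximation drops lower-order terms.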
- It seems that N+1 as the number of comparisons for the partition step is an error in the book. You need to find out for each of the N–1 non-pivot elements whether it is less than or greater than the pivot, which takes one comparison; thus N–1 comparisons in total, not N+1. (Consider the simplest case, N=2, i.e. one pivot and one other element: There is absolutely no room for doing three comparisons between two elements.)
- Consider the case where the chosen pivot happens to be the smallest element (k=1). This means that the array is divided into an empty part to the left (there are no elements that are less than the pivot) and a part to the right that contains all the elements except for the pivot (all other elements are greater than the pivot). This means that the sub-problems that you now want to solve recursively have sizes 0 and N–1 (k–1 and N–k), respectively, and require C(0) and C(N–1) comparisons; thus, C(0)+C(N–1) in total.
  If the pivot happens to be the second smallest element (k=2), the sub-problem sizes are 1 and N–2 (k–1 and N–k; one element on the left, because it is the only one smaller than the pivot). Thus, recursively solving these sub-problems requires C(1)+C(N–2) comparisons. And so on if the pivot is the third smallest element, the fourth, etc. These are the expressions in the numerators.
  Because the pivot is chosen randomly from among the N elements, each case (pivot is smallest, pivot is second smallest, etc.) occurs with equal probability 1/N. That’s where the N in the denominators comes from.
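The "each case occurs with probability 1/N" argument can be checked by brute force for small N. The sketch below is an illustration, not the book's implementation: instead of picking a random pivot it always uses the first item and averages over every possible input order, which makes each pivot rank equally likely. It charges one compare per non-pivot item (the N–1 model from this answer). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def count_compares(items):
    """Compares charged when the first item is the pivot: one per non-pivot item."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    # Order-preserving split, so each part is again in uniformly random order.
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x > pivot]
    return len(rest) + count_compares(smaller) + count_compares(larger)


def average_over_all_orders(n):
    """Exact average compare count over all n! orderings of n distinct items."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_compares(p) for p in perms), len(perms))


def recurrence(n, partition_cost):
    """C(n) = partition_cost(n) + (1/n) * sum over k of (C(k-1) + C(n-k))."""
    c = [Fraction(0), Fraction(0)]  # assumed base cases C(0) = C(1) = 0
    for m in range(2, n + 1):
        c.append(partition_cost(m) + Fraction(2, m) * sum(c[:m]))
    return c[n]


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, average_over_all_orders(n), recurrence(n, lambda m: m - 1))
```

Passing `lambda m: m + 1` as `partition_cost` instead gives the values for the N+1 version of the recurrence, so the two cost models can be compared side by side.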