Trying to understand the 2N lnN compares for quicksort

I was going through the analysis of quicksort in Sedgewick’s Algorithms book. He gives the following recurrence relation for the number of compares quicksort uses to sort an array of N distinct items:

C(N) = (N + 1) + (C(0) + C(N-1))/N + (C(1) + C(N-2))/N + ... + (C(N-1) + C(0))/N

I am having a tough time understanding this. I know that each element has probability 1/N of becoming the pivot, and that if the k-th smallest element becomes the pivot, then the left sub-array will have k-1 elements and the right sub-array will have N-k elements.

1. How does the cost of partitioning become N+1? Does it take N+1 compares to do the partitioning?

2. Sedgewick says that if, for each value of k, you add up the probability that the partitioning element is k, plus the cost for the two sub-arrays, you get the above equation.

Can someone explain this so that those with less math knowledge (me) can understand?
Specifically how do you get the second term in the equation?
What exactly does that term stand for?

The cost function C for quicksort consists of two parts. The first part is the cost of partitioning the array in two ‘halves’ (the halves don’t have to be of equal size, hence the quotes). The second part is the cost of sorting those two halves.

The (N + 1) term is actually a condensed term, and comes from the terms
```
(N - 1) + 2
```
This is the cost of the partitioning in quicksort: N-1 compares with the pivot value, and 2 additional compares due to some boundary conditions in the partitioning.
The second part of the equation consists of the costs for sorting the two ‘halves’ on either side of the pivot value k.
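
To see where two extra compares can come from, here is a minimal sketch of a two-pointer partition with a compare counter added. It is written in the style of the partitioning code in Sedgewick's book as I recall it, so treat the details as an assumption; the name `partition_two_pointer` and the counter are mine. The left scan and the right scan each stop only after they have passed each other, so with distinct keys the two scans overlap by two positions.

```python
def partition_two_pointer(a, lo, hi):
    """Partition a[lo..hi] around a[lo]; return (pivot_index, compares).

    Assumes hi > lo and distinct keys. Illustrative sketch only.
    """
    pivot = a[lo]
    i, j = lo, hi + 1
    compares = 0
    while True:
        # Scan from the left until an element not smaller than the pivot.
        while True:
            i += 1
            compares += 1
            if not a[i] < pivot or i == hi:
                break
        # Scan from the right until an element not larger than the pivot.
        while True:
            j -= 1
            compares += 1
            if not pivot < a[j] or j == lo:
                break
        if i >= j:
            break
        a[i], a[j] = a[j], a[i]
    a[lo], a[j] = a[j], a[lo]  # put the pivot into its final position
    return j, compares


if __name__ == "__main__":
    data = [2, 0, 4, 1, 3]
    print(partition_two_pointer(data, 0, len(data) - 1), data)
```

In this sketch the count comes out as N+1, except when the pivot is the largest element: then the left scan runs to the end of the array and the count is N.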

After choosing a pivot value, you are left with two unsorted ‘halves’. The cost of sorting these ‘halves’ depends on their size and is most easily described as a recursive application of the cost function C. If the pivot is the smallest of the N values, the costs for sorting the two ‘halves’ are C(0) and C(N-1) respectively (the cost for sorting an array with 0 elements and the cost for sorting one with N-1 elements).
If the pivot is the fifth-smallest, then the costs for sorting the two ‘halves’ are C(4) and C(N-5) respectively (the cost for sorting an array with 4 elements and the cost for sorting one with N-5 elements). And similarly for all other pivot values.

But how much does it cost to sort those two ‘halves’ if you don’t know the pivot value? This is done by taking the cost for each possible value of the pivot and multiplying that by the chance that that particular value turns up.

As each pivot value is equally likely, the chance of choosing a particular pivot value is 1/N if you have N elements. To understand this, think about rolling a die. With a fair die, each side is equally likely to end up facing up, so the chance to roll a 1 is 1/6.
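
The averaging just described can be sketched in a few lines. This is only an illustration: the function name `expected_subproblem_cost` is made up, and the small table of known costs assumes C(0) = C(1) = 0 and a partitioning cost of N+1, which gives C(2) = 3 and C(3) = 6.

```python
from fractions import Fraction


def expected_subproblem_cost(cost, n):
    """Average of cost[k-1] + cost[n-k] over pivot ranks k = 1..n.

    Every rank k is assumed to be equally likely, with chance 1/n.
    """
    chance = Fraction(1, n)
    return sum(chance * (cost[k - 1] + cost[n - k]) for k in range(1, n + 1))


if __name__ == "__main__":
    known = [0, 0, 3, 6]  # assumed values of C(0), C(1), C(2), C(3)
    print(expected_subproblem_cost(known, 4))  # 9/2
```

Adding the partitioning cost of 4 + 1 = 5 to this average gives C(4) = 9.5 under the same assumptions.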

Combined, this gives the summation term where, for each possible value k of the pivot, the cost (C(k-1) + C(N-k)) is multiplied by the chance (1/N).
The further derivation from the summation formula in the question to the 2N lnN in the title takes too much math to explain here in detail, but it is based on the observation that the cost for sorting an array of N elements (C(N)) can be expressed in terms of the cost for sorting an array of N-1 elements (C(N-1)) alone: N·C(N) = (N+1)·C(N-1) + 2N, where the last term is directly proportional to N.
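
For readers who would rather check the numbers than follow the algebra, here is a minimal sketch that evaluates the recurrence directly and prints it next to 2N lnN. It assumes C(0) = C(1) = 0, which the question does not state, and the name `recurrence_costs` is made up. It uses the fact that every C(k) with k < N appears twice in the sum, so the summation term equals (2/N)·(C(0) + ... + C(N-1)).

```python
from fractions import Fraction
import math


def recurrence_costs(max_n):
    """Return [C(0), ..., C(max_n)] for C(N) = (N + 1) + (2/N) * (C(0) + ... + C(N-1)).

    Assumes C(0) = C(1) = 0; the base cases are not given in the source.
    """
    costs = [Fraction(0), Fraction(0)]
    running_sum = Fraction(0)  # C(0) + ... + C(n-2) at the start of each step
    for n in range(2, max_n + 1):
        running_sum += costs[n - 1]
        costs.append((n + 1) + Fraction(2, n) * running_sum)
    return costs[: max_n + 1]


if __name__ == "__main__":
    costs = recurrence_costs(1000)
    for n in (10, 100, 1000):
        print(n, float(costs[n]), 2 * n * math.log(n))
```

The exact values stay somewhat below 2N lnN and approach it only slowly, because the exact solution also contains a term that is linear in N.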

It seems that N+1 as the number of comparisons for the partition step is an error in the book. You need to find out for each of the N–1 non-pivot elements whether it is less than or greater than the pivot, which takes one comparison; thus N–1 comparisons in total, not N+1. (Consider the simplest case, N=2, i.e. one pivot and one other element: There is absolutely no room for doing three comparisons between two elements.)
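
As an illustration of a partition step that uses exactly one comparison per non-pivot element, here is a minimal sketch of a single-scan partition (often called the Lomuto scheme). This is not the partitioning code from the book; it is a swapped-in technique chosen because its compare count is easy to see, and the name `partition_single_scan` is made up.

```python
def partition_single_scan(a, lo, hi):
    """Partition a[lo..hi] around a[hi]; return (pivot_index, compares)."""
    pivot = a[hi]
    compares = 0
    boundary = lo  # a[lo..boundary-1] holds the elements smaller than the pivot
    for j in range(lo, hi):
        compares += 1  # one compare for each of the N-1 non-pivot elements
        if a[j] < pivot:
            a[boundary], a[j] = a[j], a[boundary]
            boundary += 1
    a[boundary], a[hi] = a[hi], a[boundary]  # move the pivot between the two parts
    return boundary, compares


if __name__ == "__main__":
    data = [2, 0, 4, 1, 3]
    print(partition_single_scan(data, 0, len(data) - 1), data)
```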
Consider the case where the chosen pivot happens to be the smallest element (k=1). This means that the array is divided into an empty part to the left (there are no elements that are less than the pivot) and a part to the right that contains all the elements except for the pivot (all other elements are greater than the pivot). This means that the sub-problems that you now want to solve recursively have sizes 0 and N–1 (k–1 and N–k), respectively, and require C(0) and C(N–1) comparisons; thus, C(0)+C(N–1) in total.

If the pivot happens to be the second smallest element (k=2), the sub-problem sizes are 1 and N–2 (k–1 and N–k; one element on the left, because it is the only one smaller than the pivot). Thus, recursively solving these sub-problems requires C(1)+C(N–2) comparisons. And so on if the pivot is the third smallest element, the fourth, etc. These are the expressions in the numerators.

Because the pivot is chosen randomly from among the N elements, each case (pivot is smallest, pivot is second smallest, etc.) occurs with equal probability 1/N. That’s where the N in the denominators comes from.
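
The averaging argument can be checked by brute force for small N. The sketch below is an illustration under stated assumptions: it counts N–1 compares per partitioning step (as argued above, rather than the book's N+1), it always uses the first element as the pivot, and it averages over all orderings of the input, which makes every pivot rank equally likely. The function names are made up.

```python
from fractions import Fraction
from itertools import permutations


def count_compares(items):
    """Compares used by a simple quicksort on distinct items, first element as pivot."""
    if len(items) <= 1:
        return 0
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x > pivot]
    # Charged as one compare per non-pivot element: len(rest) == N - 1.
    return len(rest) + count_compares(smaller) + count_compares(larger)


def average_over_all_orderings(n):
    """Exact average of count_compares over all n! orderings of n distinct items."""
    orderings = list(permutations(range(n)))
    total = sum(count_compares(list(p)) for p in orderings)
    return Fraction(total, len(orderings))


def recurrence(n):
    """C(N) = (N - 1) + sum over k of (C(k-1) + C(N-k)) / N, with C(0) = C(1) = 0."""
    costs = [Fraction(0), Fraction(0)]
    for m in range(2, n + 1):
        subproblems = sum(costs[k - 1] + costs[m - k] for k in range(1, m + 1))
        costs.append((m - 1) + subproblems / m)
    return costs[n]


if __name__ == "__main__":
    for n in range(2, 7):
        print(n, average_over_all_orderings(n), recurrence(n))
```

Replacing `m - 1` with `m + 1` in `recurrence` gives the form of the equation in the question.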


Filed under: softwareengineering - @ 22:51

Tags: algorithm-analysis, comparison, math, sorting
