Dual pivot quicksort in the face of expensive swaps

I was told this is a better place to ask this.
TLDR
Has anyone tested dual pivot quicksort performance with expensive-to-swap elements? It seems that in this case, it should massively underperform compared to standard quicksort.

And yes, I know about Cycle Sort (if it is only the original array that is expensive to modify) and that I could use indices/pointers inside the array, sort them and then swap them into their correct place.

However, the first is completely out of the question (a quadratic average case is just not good enough), and the second is unsuitable for implementing a general-purpose sort: it imposes both performance and memory overhead even in cases where it is better and faster to work with the original array.

Backstory
Inspired by a recent “question” on Stack Overflow, I decided to implement non-trivial versions of the given sorts (introsort, quicksort with 3-way partitioning, median-of-3 pivot selection, small-block insertion sort, etc.).

During some research I also came upon dual-pivot quicksort, which is the current quicksort implementation in the Java standard library. It is generally claimed to be always at least as good as standard quicksort, and empirical testing seemed to support this (which is why it is the current implementation).

However, it seems that no STL implementation uses dual-pivot quicksort for the quicksort phase of introsort, which made me wonder why. After more research I found this paper. It says that while dual-pivot quicksort performs on average 5% fewer comparisons, it performs significantly more swaps (approximately 80% more). Obviously, since Java has only primitives and reference types, swapping is always cheap. (Even so, Java uses this sort only for primitives, because it is not stable.)

It also seems that at least part of the advantage of dual-pivot quicksort lies in its improved cache behaviour (because it divides the input into smaller subarrays that fit into cache sooner).

So I wanted to see whether someone has already tested standard quicksort vs. dual-pivot quicksort when elements are expensive to swap and has the numbers (and possibly source) lying around, or whether I will have to test this myself.

Has anyone tested dual pivot quicksort performance with expensive-to-swap elements? It seems that in this case, it should massively underperform compared to standard quicksort.

During some research I also came upon dual pivot quicksort, which is the current implementation of quicksort in Java standard library. Generally it claims that it is always at least as good as standard quicksort, and empirical testing seemed to support it. (Which is the reason it is the current implementation.)

Be careful interpreting these claims. The comparison is often against a “strawman” single-pivot quicksort implementation (usually called “classic” or, as you say, “standard” quicksort) which uses a random pivot. This is sometimes disguised by unusual notation, e.g. reporting 2 N ln N comparisons for “single-pivot” quicksort. That translates to 1.386 N log (base 2) N comparisons, which is characteristic of selecting a single random element as the pivot. Random pivot selection not only performs poorly (even the worst-performing single-pivot quicksort implementation of qsort in widespread use is closer to 1.15 N log (base 2) N comparisons), but also leads to difficult-to-maintain code (to replicate a bug, you need to know which random number generator was used and what its state was when the bug happened).
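The change of base behind the 1.386 figure is easy to check. This is only an arithmetic sketch; the variable names are mine, and 1.15 is simply the qsort figure quoted above.

```python
import math

# ln N = ln 2 * log2 N, so a count of 2 N ln N equals (2 ln 2) N log2 N.
random_pivot_constant = 2 * math.log(2)
print(round(random_pivot_constant, 3))  # 1.386

# Ratio to the 1.15 N log2 N figure quoted above for a widely used qsort.
qsort_constant = 1.15
print(round(random_pivot_constant / qsort_constant, 2))  # 1.21
```

So the random-pivot baseline uses roughly 20% more comparisons than that qsort figure, before any dual-pivot scheme enters the picture.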

However, it seems that no STL implementation uses dual pivot quicksort for the quicksort phase of introsort, which made me wonder why. After more research I found this paper. It says that while dual pivot quicksort performs on average 5% less comparisons, it performs significantly more swaps. (Approximately 80% more) Obviously, since Java has only primitives and reference types, swapping is always cheap. (Even so, it uses this sort only for primitives, because it is not stable)

Again, be careful with these comparisons. “5% less” under what specific conditions? Under the best possible conditions (zero-cost, “perfect” pivots: the median for single-pivot, tertiles for dual-pivot) and uniformly distributed random input, dual-pivot quicksort will use more than 5% more comparisons than single-pivot quicksort (5/3 N log (base 3) N ≈ 1.052 N log (base 2) N vs. 1 N log (base 2) N). Swapping also depends on the implementation. Single-pivot quicksort (conditions as specified above) is expected to use 0.25 N log (base 2) N swaps if implemented efficiently. A dual-pivot implementation could theoretically achieve 1/3 N log (base 3) N ≈ 0.21 N log (base 2) N swaps (16% less), but it requires a great deal of bookkeeping; more typical would be 0.28 N log (base 2) N (12% more). Note that there are many low-cost, highly effective ways to approximate the median (i.e. pseudomedian) for single-pivot quicksort. Not so much for tertiles.
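The constants above can be reproduced with a few lines of arithmetic. This is a sketch under the stated idealised conditions only; the variable names are mine, and the 0.25 and 0.28 swap figures are taken as given from the text rather than derived. My reading of the 5/3 is that, with perfect tertile pivots, every element is compared to one pivot and about two thirds of them also to the other.

```python
import math

log2_3 = math.log2(3)  # converts N log3 N into N log2 N

# Comparisons, as multiples of N log2 N.
single_cmp = 1.0
dual_cmp = (5 / 3) / log2_3
print(round(dual_cmp, 3))  # 1.052

# Swaps, as multiples of N log2 N.
single_swap = 0.25                 # figure quoted in the text
dual_swap_best = (1 / 3) / log2_3  # theoretical best case
dual_swap_typical = 0.28           # figure quoted in the text
print(round(dual_swap_best, 2))    # 0.21

# Relative differences against single-pivot.
print(round(dual_cmp / single_cmp - 1, 3))            # 0.052 (more comparisons)
print(round(1 - dual_swap_best / single_swap, 2))     # 0.16  (fewer swaps)
print(round(dual_swap_typical / single_swap - 1, 2))  # 0.12  (more swaps)
```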

One probably wouldn’t want to use Musser introsort (recursion depth limit) with a multi-pivot scheme. Recursion depth isn’t well defined in such a case (consider that dual-pivot can behave as single-pivot if the two pivots happen to have close values, so would you compare to some multiple of the base 2 or the base 3 logarithm of array size?). Valois introsort (randomly shuffling elements) has other issues (see above re. replicating bugs).
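For reference, this is what the depth limit looks like in the single-pivot case, where it is well defined. It is a minimal sketch, not any library's implementation: it assumes the commonly used limit of 2 * floor(log2 N), a heap-based fallback, and a Lomuto partition around the middle element chosen for brevity rather than speed. All function names are illustrative.

```python
import heapq
import math

def introsort(a):
    """Sort list `a` in place: quicksort with a recursion depth limit."""
    if len(a) > 1:
        # Assumed depth budget: 2 * floor(log2 N).
        _sort(a, 0, len(a), 2 * int(math.log2(len(a))))

def _sort(a, lo, hi, depth):
    # Sorts the half-open range a[lo:hi].
    while hi - lo > 1:
        if depth == 0:
            # Budget exhausted: finish this range with a heap-based sort.
            chunk = a[lo:hi]
            heapq.heapify(chunk)
            a[lo:hi] = [heapq.heappop(chunk) for _ in range(len(chunk))]
            return
        depth -= 1
        p = _partition(a, lo, hi)
        _sort(a, lo, p, depth)  # recurse on the left part
        lo = p + 1              # loop on the right part

def _partition(a, lo, hi):
    # Lomuto partition; the middle element is moved to the end as pivot.
    mid = (lo + hi) // 2
    a[mid], a[hi - 1] = a[hi - 1], a[mid]
    pivot = a[hi - 1]
    store = lo
    for i in range(lo, hi - 1):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[hi - 1] = a[hi - 1], a[store]
    return store
```

The budget is tied to the base 2 logarithm because each single-pivot partition ideally halves the range. With two pivots there is no equally natural choice, which is the difficulty described above.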

It also seems that at least part of the advantage of dual pivot quicksort is in its improved cache behaviour (Because it divides into smaller subarray that can fit into cache faster).

That has been conjectured, but not definitively demonstrated. It may be a red herring; quicksort is mostly cache-oblivious as accesses tend to be sequential.

So I wanted to see whether someone already tested standard quicksort vs dual pivot quicksort when elements are expensive to swap and has the numbers (and possibly source) lying around, or whether I will have to test this myself.

There’s code (in C) including a testing framework at https://github.com/brucelilly/quickselect and multi-pivot issues are discussed in some detail in https://github.com/brucelilly/quickselect/blob/master/lib/libmedian/doc/pub/generic/paper.pdf. The code includes two dual-pivot implementations and many single-pivot implementations. A highly-tuned dual-pivot implementation indeed is a bit more than 1.052 N log N comparisons asymptotically, and several single-pivot implementations are lower, closer to 1 N log N comparisons (the paper includes several performance graphs). I haven’t attempted to minimize the swaps in the dual-pivot code; the necessary bookkeeping is really onerous, and as comparisons outnumber swaps, swaps would have to be really expensive to be a factor, and in such a case one would probably use indirection (rearranging pointers to data).

The answer is probably not, because it’s reasonably obvious that it wouldn’t matter except in rare cases.

Assume that the reason that swaps are expensive is that you are sorting objects that are large, contained in a database or accessed through an API. Regardless, those objects have keys and (by your statement) keys carry enough information that they can be compared cheaply.

Simply ensure that each key contains a reference to the underlying object. If necessary attach a pointer or index to each key. Then sort by any available method — it obviously doesn’t matter because the swaps are the expensive part.

Now perform a chase-your-tail swap on the sorted data. Start with the first key, swap it into the first position, swap the object that was there into its position and so on. You will have performed exactly the minimum number of swaps required to sort the data.

In other words, for expensive swaps there is a simple algorithm that guarantees the minimum number of swaps and works regardless of the sort algorithm. The only sort algorithms worth analysing are the ones where comparisons are expensive.
