I was reading this blog post: http://kjellkod.wordpress.com/2012/02/25/why-you-should-never-ever-ever-use-linked-list-in-your-code-again/
and there I found some code to run: http://ideone.com/62Emz
I compiled it using gcc 4.7.2 with g++ -std=c++11 on my old laptop with a T5450 CPU (two cores, 32 KB of L1 cache each, and 2 MB of (shared?) L2 cache), and I got these results:
********** Times in microseconds **********
Elements    ADD (List, Vector)     ERASE (List, Vector)
100         0, 0                   0, 0
200         0, 0                   15625, 0
500         0, 0                   0, 0
1000        15625, 0               0, 15625
4000        109375, 140625         46875, 31250
10000       750000, 875000         312500, 187500
20000       2968750, 3468750       1296875, 781250
40000       12000000, 13843750     5359375, 3156250
Exiting test. The whole measuring took 45375 milliseconds (45 seconds or 0 minutes)
Which actually says the opposite of what the author of that blog post claims, at least for the ADD operation: the list is faster at ADD than the vector. What conclusions should I draw from my results? Does this prove anything? What should I make of it?
First of all, congratulations for doing the right thing and measuring rather than believing advice about efficiency! With modern computer architecture, it is harder than ever to predict precisely how a small change in data structures will affect runtime, because of the many levels of the memory hierarchy, out-of-order execution, aggressive code optimizers, etc. If your use case is indeed what you have measured, then yes, you will be better off with a list.
That said, that program doesn't do much; in practice you will almost certainly access elements more often than you add or delete them, and probably in non-successive ways. I suspect that if you rerun the benchmark with a lot of random accesses, the results might turn around (a minimal sketch of such a measurement follows below), but… remember what I just said? Never assume. Always measure. Happy profiling!
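As an illustration only, here is one way such an access-heavy measurement could be sketched; the container size, element type, number of lookups, and timing helper are my own choices and are not part of the original benchmark:

    // Hypothetical sketch: time a batch of accesses at random positions in a
    // std::vector vs. a std::list holding the same contents.  The sizes and
    // the element type are arbitrary choices for illustration.
    #include <chrono>
    #include <cstddef>
    #include <iostream>
    #include <iterator>
    #include <list>
    #include <random>
    #include <vector>

    template <typename Container>
    long long time_random_accesses(const Container& c,
                                   const std::vector<std::size_t>& positions) {
        auto start = std::chrono::steady_clock::now();
        long long sum = 0;
        for (std::size_t pos : positions) {
            auto it = c.begin();
            std::advance(it, pos);   // O(1) for vector, O(pos) for list
            sum += *it;
        }
        auto stop = std::chrono::steady_clock::now();
        std::cout << "checksum: " << sum << '\n';  // keep the work observable
        return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    }

    int main() {
        const std::size_t n = 100000;            // arbitrary container size
        std::vector<int> v(n, 1);
        std::list<int> l(v.begin(), v.end());

        std::mt19937 gen(42);
        std::uniform_int_distribution<std::size_t> dist(0, n - 1);
        std::vector<std::size_t> positions(10000);  // arbitrary number of lookups
        for (auto& p : positions) p = dist(gen);

        std::cout << "vector: " << time_random_accesses(v, positions) << " us\n";
        std::cout << "list:   " << time_random_accesses(l, positions) << " us\n";
    }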
The title of that article is intentionally disingenuous. The author is making three points: that linked list traversals and searches are slow, which is true; that most people don’t realize how much slower they are than even linear array searches due to cache misses, which is probably true; and that most people don’t take search time into account when selecting a linked list, which is probably false.
His benchmark program lumps the search time in with the adds and deletes, which is the disingenuous part. When I select a linked list, the very first thing I ask myself is whether the poor search time can be worked around or is an acceptable trade-off. I think that is true of most people.
For example, the last time I used a linked list was when I needed to keep some items sorted by how recently they were last accessed. The most common operation by far was moving an item from the middle of the list to the front, an O(1) operation for a linked list. However, there’s that pesky search time. It turned out to be convenient to store a pointer to the linked list node in another data structure I needed anyway, so the searches would be O(1) as well.
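As a rough illustration of that pattern (and only an illustration: the key type, the names, and the use of an unordered_map holding list iterators are my own choices, not the actual code I described), it might look something like this:

    // Hypothetical sketch of a "most recently used" structure: an external
    // index into a std::list so both lookup and move-to-front are O(1).
    #include <list>
    #include <string>
    #include <unordered_map>

    class RecencyList {
    public:
        // Touch an item: move it to the front, inserting it if it is new.
        void touch(const std::string& key) {
            auto it = index_.find(key);
            if (it != index_.end()) {
                // splice is O(1): it relinks the node at the front without copying.
                items_.splice(items_.begin(), items_, it->second);
            } else {
                items_.push_front(key);
                index_[key] = items_.begin();
            }
        }

        const std::string& most_recent() const { return items_.front(); }

    private:
        std::list<std::string> items_;                 // ordered by recency
        std::unordered_map<std::string,
                           std::list<std::string>::iterator> index_;  // O(1) lookup
    };

The key property is that std::list::splice relinks the node in O(1) and leaves the stored iterator valid, so the index never has to be updated when an item moves to the front.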
However, in other circumstances, I have just kept the O(n) searches because the semantics of a linked list simplified the code, and the performance hit was negligible. Your own test shows the difference measured in hundredths of a second for adding or deleting 4000 nodes. For most applications that’s completely unnoticeable.
The fact that you got different results than the author on his own benchmark is also very interesting, and it illustrates well why you should do your own measuring. Your compiler, operating system, standard library implementation, and even the other processes running on your system can all make a significant difference in things like how many cache misses your code generates.
First, a suggestion: you probably want to wait more than a couple of minutes before accepting an answer; you’ll likely get more answers that way.
Second: when benchmarking, you should always compile with optimizations turned on; something like g++ -std=c++11 -O3 -march=native should get you good results.
std::list really is a poor data structure, and I have not yet found a situation in which it is the best. For instance, consider the case where you want to maintain a sorted data structure. You may think that std::list, with O(1) insertion and deletion time, would be ideal, but in fact it is sub-optimal!
For these tests, my contained type is a trivial class that contains an array of 4-byte integers. I test a std::array of size 1, 10, and 100 (giving me element sizes of 4, 40, and 400 bytes). I chose a std::array because a move and a copy are the same thing. The initial element of the array is initialized to some random number between 0 and std::numeric_limits<uint32_t>::max(). I create a std::vector of some number of these elements (the x-axis), then I start the timer. I iterate over that std::vector and insert each element, in sorted order, into the container under test (sorted by the first element of the array, using operator<=). To help prevent the compiler from cleverly optimizing the work away, I then output the first element of the sorted container (which cannot be determined until the end) to a file and stop the timer.
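For concreteness, a stripped-down sketch of that kind of comparison might look like the following; the random seed, element count, the single 40-byte element size, and the plain less-than comparator (rather than the operator<= mentioned above) are my own simplifications, and this is not the exact program behind the graphs:

    // Hypothetical sketch: insert pre-generated elements into their sorted
    // positions in a std::vector and in a std::list, keyed on the first
    // integer of a std::array payload, and time each phase.
    #include <algorithm>
    #include <array>
    #include <chrono>
    #include <cstdint>
    #include <iostream>
    #include <list>
    #include <random>
    #include <vector>

    using Element = std::array<std::uint32_t, 10>;  // 40-byte element, one of the tested sizes

    bool less_first(const Element& a, const Element& b) { return a[0] < b[0]; }

    int main() {
        std::mt19937 gen(12345);
        std::uniform_int_distribution<std::uint32_t> dist;  // 0 .. uint32_t max
        std::vector<Element> source(10000);                 // arbitrary element count
        for (auto& e : source) e[0] = dist(gen);

        auto start = std::chrono::steady_clock::now();
        std::vector<Element> sorted_vec;
        for (const auto& e : source)
            sorted_vec.insert(std::upper_bound(sorted_vec.begin(), sorted_vec.end(),
                                               e, less_first), e);
        auto mid = std::chrono::steady_clock::now();

        std::list<Element> sorted_list;
        for (const auto& e : source) {
            auto pos = std::find_if(sorted_list.begin(), sorted_list.end(),
                                    [&](const Element& x) { return less_first(e, x); });
            sorted_list.insert(pos, e);
        }
        auto stop = std::chrono::steady_clock::now();

        // Print the first elements so the work cannot be optimized away.
        std::cout << sorted_vec.front()[0] << ' ' << sorted_list.front()[0] << '\n';
        std::cout << "vector: "
                  << std::chrono::duration_cast<std::chrono::milliseconds>(mid - start).count()
                  << " ms, list: "
                  << std::chrono::duration_cast<std::chrono::milliseconds>(stop - mid).count()
                  << " ms\n";
    }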
These are my results for various sizes of elements:
We see that for 4- and 40-byte elements, std::vector is better than std::list even at this inserting-into-the-middle workload, and for any element size you’re better off using a std::vector<std::unique_ptr> than a std::list.
In general, I cannot come up with a reason to use std::list over a class that wraps std::vector<std::unique_ptr> to make it appear as though it has value semantics, other than the ability to copy (which I hope to fix by submitting a value_ptr class to Boost if one isn’t added soon; there is already some discussion about that).
As an additional note, if this were real code and not a comparison/benchmark game, I would have written the std::vector version very differently: I would have copied the entire original container directly, then used std::sort. I intend to write a more complete analysis of data structures, with a focus on the importance of data locality, and I will include timings for the “correct” way to do it (the correct version blows all other methods out of the water, completing in much less than a second for 400,000 elements of size 400, which is 10 times more elements than I tested in my graphs).
I hope I explained everything well; these are from some tests I ran several months ago, and I haven’t yet finished my notes on the subject.
Tests were done on an Intel i5 machine with 4 GiB of RAM. I believe I was using Fedora 17 x64.