I was about to write a data container for storing a continuous and re-sizable memory block, in which items would only be possible to access from one side by pushing or popping – basically a LIFO stack. I’ve been reading a lot about CPU cache and memory access patterns and wanted said container to be as cache-friendly as possible.
So I’ve taken a look at common implementations of LIFO stacks on the internet. Most of them suggest using a dynamic array as a base and accessing the data by appending or removing items from the end of the array. In this case the empty capacity gap is stored after the data in the following way:
stack bottom stack top
array begin V array end
V V V
|V data V |V gap |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|8 |7 |6 |5 |4 |3 |2 |1 |0 | | | | | |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| cache line |
However, this does not seem very cache-friendly to me. Whenever one looks up the item on the top of the stack, CPU would fetch the memory from gap buffer containing garbage, or at least this is how I understand it.
Would the approach where gap buffer appears as the begging of memory block and the data at the end be more performent? In this case the indexes of the items would be reversed, same with bottom and top. In this case the cache line could hit more items like so:
stack top stack bottom
V V
| gap |V data V |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| | | | | |0 |1 |2 |3 |4 |5 |6 |7 |8 |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
| cache line |
Is my understanding correct here? Which approach is more performent and cache-friendly?
I could be totally wrong with my assumptions. As I said most of the implementations that I saw use the 1st approach so there must be something in it.
Cheers!