I have been experimenting with a simple true/false sharing benchmark, which does regular load+increment+write on a pointer. Basically this:
    static void do_increments(size_t *buffer, size_t iterations)
    {
        while (iterations) {
            buffer[0]++;
            --iterations;
        }
    }
This function is called right after waiting on a barrier, from two threads that are pinned to two different physical cores. Depending on the value of the buffer pointer, this can be compiled to exhibit true sharing (same address for both cores), false sharing (same cacheline, but different addresses), or no sharing (different cachelines).
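For reference, here is a minimal sketch of the kind of harness described above. It is not the exact benchmark code. The names `pick_counter`, `worker`, `run_pair`, and `CACHELINE_SIZE` are made up for illustration, the 64-byte line size is an assumption, the counter is marked `volatile` so the compiler keeps the per-iteration load and store, and pinning to CPUs 0 and 1 is a best-effort guess (on some machines those are SMT siblings rather than separate physical cores). Timing code is left out.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CACHELINE_SIZE 64 /* assumed line size; check your target */

enum sharing_mode { TRUE_SHARING, FALSE_SHARING, NO_SHARING };

struct worker_args {
    pthread_barrier_t *barrier;
    volatile size_t *counter; /* volatile: keep the load+store per iteration */
    size_t iterations;
    int cpu;                  /* CPU index to pin to (assumption: 0 and 1) */
};

/* Choose the address that thread `thread_index` (0 or 1) increments. */
static size_t *pick_counter(size_t *base, enum sharing_mode mode, int thread_index)
{
    if (thread_index == 0 || mode == TRUE_SHARING)
        return base;                               /* same address */
    if (mode == FALSE_SHARING)
        return base + 1;                           /* same line, next word */
    return base + CACHELINE_SIZE / sizeof(size_t); /* start of the next line */
}

static void *worker(void *opaque)
{
    struct worker_args *args = opaque;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(args->cpu, &set);
    /* Best-effort pinning (Linux-specific); a failure is ignored here. */
    (void)pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    pthread_barrier_wait(args->barrier); /* both threads start together */
    for (size_t i = args->iterations; i; --i)
        (*args->counter)++;
    return NULL;
}

/* Run both threads in the given mode; store each thread's final counter. */
static void run_pair(enum sharing_mode mode, size_t iterations, size_t out[2])
{
    size_t *base = aligned_alloc(CACHELINE_SIZE, 2 * CACHELINE_SIZE);
    assert(base != NULL);
    memset(base, 0, 2 * CACHELINE_SIZE);

    pthread_barrier_t barrier;
    pthread_barrier_init(&barrier, NULL, 2);

    pthread_t threads[2];
    struct worker_args args[2];
    for (int i = 0; i < 2; ++i) {
        args[i] = (struct worker_args){ &barrier, pick_counter(base, mode, i),
                                        iterations, i };
        int rc = pthread_create(&threads[i], NULL, worker, &args[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < 2; ++i)
        pthread_join(threads[i], NULL);
    for (int i = 0; i < 2; ++i)
        out[i] = *args[i].counter;

    pthread_barrier_destroy(&barrier);
    free(base);
}
```

In the false-sharing and no-sharing modes each thread owns its own word, so both counters end up equal to `iterations`. In the true-sharing mode the two threads race on one word, so updates can be lost and the final value is not predictable (formally this is a data race in C).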
When run on x86, the true and false sharing scenarios show a slowdown compared to no sharing. However, on some ARM cores, like the Cortex A73, no slowdown is seen regardless of the address of buffer. I have also seen some RISC-V cores exhibit the same behaviour (no slowdown).
To try and understand why some platforms slow down and others don’t, I tried to gain a deeper understanding of why exactly false sharing causes a slowdown. For x86 it is nicely explained in "Why does false sharing still affect non atomics, but much less than atomics?"
Basically, on x86 chips you get stalls from false sharing because of either of the following (correct me if I’m wrong on this, please!):
- Memory ordering machine clears, which happen when a write from a different core becomes visible after a load from our core has started, but before it has determined its value
- When our store buffer has filled up, meaning we have to flush it, and do the rounds to keep the cache coherent (i.e. invalidate the cacheline in other cores and wait for invalidate ack)
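One way to probe which of these effects dominates on a given core, sketched here under the assumption that the linked explanation holds, is to compare the plain increment with an atomic read-modify-write. The function names `increment_plain` and `increment_atomic` are made up for illustration. On x86 the relaxed `atomic_fetch_add_explicit` compiles to a `lock`-prefixed instruction, which needs exclusive ownership of the line for every operation and cannot be satisfied from the store buffer, while the plain version can reload its own pending store via store forwarding.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Plain load + add + store. `volatile` only stops the compiler from
 * collapsing the loop; it adds no atomicity or ordering. */
static void increment_plain(volatile size_t *counter, size_t iterations)
{
    while (iterations--)
        (*counter)++;
}

/* Atomic read-modify-write with relaxed ordering. Each iteration is a
 * single indivisible update of the counter. */
static void increment_atomic(atomic_size_t *counter, size_t iterations)
{
    while (iterations--)
        atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}
```

If a core slows down a lot with `increment_atomic` under false sharing but not with `increment_plain`, that would suggest its plain stores are being absorbed by the store buffer without stalling the loop.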
The ARM cores I tested seem to share all of these implementation details: a store buffer that can be forwarded from, and coherent caches. Maybe there are some relaxed memory ordering rules that help prevent these stalls on an ARM core?
Moreover, some other cores (e.g. Cortex A76) I tested do show slowdown from false sharing. Presumably, they obey the same memory ordering rules, so it has to be some microarchitectural detail that causes slowdown from false sharing?