I am working, purely out of passion, on a simple game engine to try out ideas and implement a sample Asteroids-like game.
The commonly recommended approach is a component-based design with streamed data, etc., for good cache coherence.
I chose a monolithic object approach instead, with the data members ordered in slices that match how the engine sub-systems use them in the game loop, with cache coherence in mind.
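For illustration, here is a minimal sketch of what such a sliced monolithic object might look like in C++. Every name in it (GameObject, PhysicsSlice, RenderSlice, GameplaySlice) and the 64-byte cache line size are assumptions made for the example, not the actual engine code.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache line size; 64 bytes is common on x86-64 but not universal.
constexpr std::size_t kCacheLine = 64;

// Hypothetical slice touched by the physics update.
struct PhysicsSlice {
    float position[3];
    float velocity[3];
    float angularVelocity[3];
    float mass;
};

// Hypothetical slice read by the renderer.
struct RenderSlice {
    float transform[16];  // 4x4 matrix
};

// Hypothetical slice used by gameplay logic.
struct GameplaySlice {
    std::int32_t  health;
    std::uint32_t flags;
};

// One monolithic object; each slice starts on its own cache line so a
// sub-system that walks the objects only pulls in the lines it needs.
struct GameObject {
    alignas(kCacheLine) PhysicsSlice  physics;
    alignas(kCacheLine) RenderSlice   render;
    alignas(kCacheLine) GameplaySlice gameplay;
};

static_assert(sizeof(PhysicsSlice) <= kCacheLine,
              "physics slice should fit in one cache line");
static_assert(offsetof(GameObject, render) % kCacheLine == 0,
              "render slice should start on a cache line boundary");
```

With this layout each sub-system still strides over the whole object (192 bytes here), whereas a component-based design would pack each slice into its own contiguous array.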
In the actual game it doesn't really matter, but I ran some performance tests using 64K cubes in a multi-threaded grid of cells.
To get metrics for cache coherence I ran valgrind --tool=cachegrind --cache-sim=yes; the results are pasted below.
I have no reference for where those numbers are on a scale from “bad” to “decent” to “good”.
What can be expected from an optimized engine?
Kind regards
==1094009== I refs: 16,144,217,054
==1094009== I1 misses: 3,843,166
==1094009== LLi misses: 375,418
==1094009== I1 miss rate: 0.02%
==1094009== LLi miss rate: 0.00%
==1094009==
==1094009== D refs: 7,066,247,563 (5,233,276,534 rd + 1,832,971,029 wr)
==1094009== D1 misses: 56,909,087 ( 36,616,715 rd + 20,292,372 wr)
==1094009== LLd misses: 24,687,648 ( 18,660,335 rd + 6,027,313 wr)
==1094009== D1 miss rate: 0.8% ( 0.7% + 1.1% )
==1094009== LLd miss rate: 0.3% ( 0.4% + 0.3% )
==1094009==
==1094009== LL refs: 60,752,253 ( 40,459,881 rd + 20,292,372 wr)
==1094009== LL misses: 25,063,066 ( 19,035,753 rd + 6,027,313 wr)
==1094009== LL miss rate: 0.1% ( 0.1% + 0.3% )
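As a sanity check on how to read these figures, the sketch below recomputes the rates from the raw counts above. It assumes, as I understand cachegrind's summary, that the LL miss rate is taken relative to all memory accesses (I refs + D refs), not to LL refs. The helper name missRatePercent and the constant names are made up for this example.

```cpp
#include <cstdint>
#include <cstdio>

// Raw counts copied from the cachegrind summary above.
constexpr std::uint64_t kIRefs     = 16144217054ULL;
constexpr std::uint64_t kI1Misses  = 3843166ULL;
constexpr std::uint64_t kDRefs     = 7066247563ULL;
constexpr std::uint64_t kD1Misses  = 56909087ULL;
constexpr std::uint64_t kLLdMisses = 24687648ULL;
constexpr std::uint64_t kLLiMisses = 375418ULL;

// LL refs are the accesses that missed in L1 (instruction + data).
constexpr std::uint64_t kLLRefs   = kI1Misses + kD1Misses;
constexpr std::uint64_t kLLMisses = kLLiMisses + kLLdMisses;

// Hypothetical helper: miss rate as a percentage of the given reference count.
constexpr double missRatePercent(std::uint64_t misses, std::uint64_t refs) {
    return 100.0 * static_cast<double>(misses) / static_cast<double>(refs);
}

// Prints the global rates (as in the summary) and the "local" LL rate,
// i.e. the share of L1 misses that also miss in the last-level cache.
void printSummary() {
    std::printf("D1 miss rate:        %.3f%%\n", missRatePercent(kD1Misses, kDRefs));
    std::printf("LLd miss rate:       %.3f%%\n", missRatePercent(kLLdMisses, kDRefs));
    std::printf("LL miss rate:        %.3f%%\n", missRatePercent(kLLMisses, kIRefs + kDRefs));
    std::printf("LL misses / LL refs: %.1f%%\n", missRatePercent(kLLMisses, kLLRefs));
}
```

The last figure is not printed by cachegrind itself. It comes out at roughly 41%, meaning that about four in ten accesses that miss L1 also go all the way to main memory, which may be a more telling number than the 0.1% global rate.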