My goal is to take five 1080×1080 textures and run an algorithm in a compute shader to generate a new 1080×1080 texture. Each output pixel reads the colors of at most 16 adjacent pixels from the same input texture to compute its new color. I tried two ways to do this. The first uses five 1080×1080 textures as inputs; the second uses a single 5400×1080 texture as input. The 5400×1080 texture is made by stitching the five 1080×1080 textures together horizontally, similar to a split-screen layout. Apart from the texture inputs and a slightly different algorithm, everything else is exactly the same.
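To make the stitched layout concrete, here is a minimal Python sketch of the coordinate mapping it implies. It assumes the five textures are placed left to right in index order; the names `atlas_coord` and `clamp_to_tile` are hypothetical, and the clamping step is only a guess at what the "slightly different algorithm" has to do to keep neighbor reads inside one sub-image. It is not the actual shader code.

```python
TILE = 1080      # side length of each small texture (from the question)
NUM_TILES = 5    # number of textures stitched horizontally (from the question)


def atlas_coord(tile_index, x, y):
    """Map pixel (x, y) of small texture `tile_index` into the 5400x1080 texture.

    Assumes tiles are laid out left to right in index order.
    """
    if not 0 <= tile_index < NUM_TILES:
        raise ValueError("tile_index out of range")
    if not (0 <= x < TILE and 0 <= y < TILE):
        raise ValueError("pixel coordinate out of range")
    return tile_index * TILE + x, y


def clamp_to_tile(tile_index, ax, ay):
    """Clamp a large-texture coordinate so a neighbor read stays in its own tile.

    Hypothetical edge handling: without it, a neighbor read near a tile's
    left or right edge would pick up pixels from the adjacent sub-image.
    """
    x_min = tile_index * TILE
    x_max = x_min + TILE - 1
    cx = min(max(ax, x_min), x_max)
    cy = min(max(ay, 0), TILE - 1)
    return cx, cy
```

With five separate textures, the shader would index each one directly with (x, y); with the stitched texture, every read goes through a mapping like the one above, and one row of the large texture is five times as wide as a row of a small one.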
In my tests, the compute shader with five small input textures takes 5 ms, while the version with one large texture takes 8 ms. Profiling with PIX on Windows shows that the five-small-textures version does better than the single-large-texture version on SM Instruction Execution Throughput, L1TEX Cache Throughput, L1TEX Cache LSU Throughput, L1TEX cache hit rate, and L2 cache hit rate.
What causes these differences? Why do multiple small textures perform better than one large texture?