The picture below is a snapshot of the source code view in Nsight Compute (NCU).
Line 849 loads one integer from global memory and assigns it to a register, but NCU shows 5 global memory accesses for that line. Is it possible to reduce the number of global memory accesses from 5 to 1?
[Image: source code and SASS in NCU]
Thanks