Do the CUDA samples include a bandwidth test that uses TMA?
I am looking for CUDA sample code that measures TMA (Tensor Memory Accelerator) bandwidth. Where can I find it, or does NVIDIA not provide such code?