I am quite new to parallel programming. One thing I cannot wrap my head around is the concept of threads, thread blocks, and grids.
I think I understand the hardware model with its streaming multiprocessors (SMs). Take, for example, a GPU with 80 SMs and 64 cores per SM. I read somewhere that the number of cores per SM equals the number of threads that run at a given time, so at any instant 64 threads run in parallel on one SM. But where does the concept of a thread block come in? Does that mean there are 32 warps running in parallel, and a thread block is a combination of these 32 warps from different SMs? This part is unclear to me.
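To make the terms concrete, here is a minimal sketch of the arithmetic behind the thread hierarchy, written in plain Python rather than CUDA so it runs anywhere. It assumes a one-dimensional launch and the warp size of 32 threads used by current NVIDIA GPUs. The function names and the block size of 256 are made up for illustration and are not part of any CUDA API.

```python
WARP_SIZE = 32  # threads per warp on current NVIDIA GPUs


def warps_per_block(threads_per_block: int) -> int:
    """Number of warps a block is split into (rounded up)."""
    return (threads_per_block + WARP_SIZE - 1) // WARP_SIZE


def blocks_in_grid(total_threads: int, threads_per_block: int) -> int:
    """Number of blocks needed so the grid covers every element (rounded up)."""
    return (total_threads + threads_per_block - 1) // threads_per_block


def global_thread_index(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Mirrors the usual 1D CUDA expression blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


if __name__ == "__main__":
    threads_per_block = 256   # hypothetical block size chosen for this example
    total_threads = 1_000_000  # hypothetical problem size

    print("warps per block:", warps_per_block(threads_per_block))
    print("blocks in grid:", blocks_in_grid(total_threads, threads_per_block))
    print("warps in 64 threads:", warps_per_block(64))
    print("global index of thread 5 in block 2:",
          global_thread_index(2, threads_per_block, 5))
```

In this picture a grid is the set of all blocks in one kernel launch, a block is a group of threads, and a warp is a 32-thread slice of a single block. The sketch only covers the index arithmetic; it does not model how blocks are scheduled onto SMs.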
If I am completely wrong, kindly point me to a resource that can help me sort things out. Thanks.