I want to strictly use Tensor Cores for running inference of a pretrained full-precision CNN model in PyTorch
I have been measuring the maximum throughput I can get from my GPU for a specific CNN model. The GPU has both CUDA cores and Tensor Cores, so I want to run the model on both types of cores simultaneously and check the maximum throughput I can achieve.
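For reference, here is a minimal sketch of how the throughput comparison could be set up. It rests on several assumptions: the small `nn.Sequential` network is only a stand-in for my pretrained model, and `measure_throughput` and `compare_precisions` are hypothetical helper names. As far as I understand, PyTorch does not expose a switch that pins a kernel to one type of core. The sketch therefore only compares an FP32 run with TF32 disabled against an FP16 autocast run, which is eligible for Tensor Core kernels. Whether Tensor Cores are actually used would still need to be confirmed with a profiler.

```python
import time


def measure_throughput(run_batch, batch_size, iters=50, warmup=10, sync=None):
    """Return samples per second for run_batch, a zero-argument callable.

    sync is an optional callable (e.g. torch.cuda.synchronize) that blocks
    until queued GPU work has finished, so the timer covers real execution.
    """
    for _ in range(warmup):  # warm-up iterations are not timed
        run_batch()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    if sync is not None:
        sync()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def compare_precisions(batch_size=32):
    """Compare FP32 and FP16-autocast throughput on a CUDA device."""
    # Imported here so measure_throughput stays usable without PyTorch.
    import torch
    from torch import nn

    device = torch.device("cuda")
    # Assumption: a tiny stand-in CNN; replace with the pretrained model.
    model = nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 10),
    ).to(device).eval()
    x = torch.randn(batch_size, 3, 224, 224, device=device)

    def fp32_step():
        with torch.inference_mode():
            model(x)

    def fp16_step():
        # autocast runs convolutions and matmuls in FP16 where it is safe to.
        with torch.inference_mode(), torch.autocast(device_type="cuda", dtype=torch.float16):
            model(x)

    # Disable TF32 so the FP32 baseline is not silently routed to Tensor Cores
    # on Ampere or newer GPUs.
    torch.backends.cuda.matmul.allow_tf32 = False
    torch.backends.cudnn.allow_tf32 = False

    return {
        "fp32": measure_throughput(fp32_step, batch_size, sync=torch.cuda.synchronize),
        "fp16_autocast": measure_throughput(fp16_step, batch_size, sync=torch.cuda.synchronize),
    }
```

On a machine with a CUDA GPU, `print(compare_precisions())` would report both figures in samples per second. The explicit synchronization matters because CUDA kernels launch asynchronously, so timing without it would mostly measure launch overhead.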