OpenCL: high fixed cost to run GPU instructions, what to do? I’m using OpenCL in C++. My GPU is an NVIDIA GeForce RTX 3070.