I am trying to resolve the following compilation error:
error: more than one conversion function from "__nv_bfloat16" to "uint8_t"
The error seems to be related to the GPU architecture, as mentioned here: https://github.com/NVIDIA/cutlass/issues/4
However, what is discussed there does not seem to resolve the problem.
Relevant build configuration (from the CMake output):
-- CUDA supported arches: 7.0;7.5;8.0;8.6;8.9;9.0
-- CUDA target arches: 86-real
-- CMake Version: 3.30.3
-- CUTLASS 3.5.1
What could be the problem?
I wish I could provide a small example that reproduces the problem, but that does not seem feasible in this case.
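For reference, here is a minimal plain C++ sketch of how this class of diagnostic can arise in general. It is not a reproduction of the CUTLASS build failure. `FakeBf16` and `to_u8` are hypothetical names, and the set of conversion operators is an assumption chosen for illustration rather than the actual definition of `__nv_bfloat16`. The idea is that a type with several implicit conversion operators, none of which targets `uint8_t` directly, gives the compiler no single best path, while an explicit intermediate cast picks one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 16-bit float type; NOT the real __nv_bfloat16.
struct FakeBf16 {
    float value;

    // Several implicit conversion operators, none of them to uint8_t.
    operator float() const { return value; }
    operator int() const { return static_cast<int>(value); }
    operator unsigned int() const { return static_cast<unsigned int>(value); }
};

// Writing `std::uint8_t bad = x;` for a FakeBf16 x is rejected as ambiguous:
// float -> uint8_t, int -> uint8_t and unsigned int -> uint8_t are all
// standard conversions of the same rank, so no operator is preferred.

// Workaround sketch: name the intermediate type explicitly, so only
// operator float() is an exact match, then narrow the result.
inline std::uint8_t to_u8(const FakeBf16& x) {
    return static_cast<std::uint8_t>(static_cast<float>(x));
}

// Small self-check of the explicit conversion path.
inline void self_check() {
    assert(to_u8(FakeBf16{42.0f}) == 42);
}
```

If the real error has the same shape, the offending line would be one where a `__nv_bfloat16` value is assigned or passed to a `uint8_t` without an explicit intermediate type. That is only a guess based on the wording of the message.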