OpenCL has had a bumpy ride over the years w.r.t. the prospects of using C++ to write kernels. First there was the “OpenCL C++ kernel language”, standardized with OpenCL v2.2 – but that did not quite catch on, and AFAICR never got supported on NVIDIA GPUs. Then, with OpenCL 3.0, it was scrapped entirely in favor of C++ for OpenCL, which is “based on a Clang/LLVM compiler which implements a subset of C++17 and SPIR-V intermediate code” – and was announced/published as late as 2021.
Anyway, let’s consider where we stand now, i.e. December 2024. Can I use one or the other of the C++-ish kernel languages, and compile it – using NVIDIA’s own tools and libraries, or even clang/LLVM followed by NVIDIA stuff – so that it can run on NVIDIA GPUs? And if so – how, roughly, is that done?
From what I gather, it should be something like:
C++-for-OpenCL source code
==[A compiler such as clang which supports C++-for-OpenCL]==>
An intermediate representation, e.g. SPIR-V or PTX
==[One or more facilities by NVIDIA]==>
kernel running on the GPU
but I’m not sure even about that, nor do I know how to make it concrete and specific.
Notes:
- I am not interested in alternative ways to get GPUs to run stuff written in C++, e.g. using CUDA kernels, NVIDIA’s thrust, SYCL, OpenMP, OpenACC etc. Those are all fine and good but not what this question is about.
- I asked a similar question in 2019, when the state of affairs was different than today, and got an answer relevant to that time. I think this separate question is appropriate (but others may think differently).
- If I could get PTX from C++-for-OpenCL, I know the rest of the way, so if you know how to make that step, I can fill in the rest of the answer: basically, you load the PTX with cuModuleLoad() (from a file) or cuModuleLoadData() (from an image already in memory).