Related Content

Tag Archive for c++cuda

How to properly free a CUDA context?

I am implementing OptiX denoising inside my C++ path tracer, so I need to create a CUDA context before calling OptiX kernels. That context has to be created every time I spawn a rendering thread, since each thread has its own CUDA context

identifier “atomicAdd” in cuda

I was running the k-means algorithm using CUDA and encountered a problem in this part of the code: if (idx < numPoints) { atomicAdd(&counts[points[idx].cluster], 1);

Perform quick flip operations on matrices using CUDA

I want to perform a fast flip operation, similar to MATLAB's flip, on a 3D matrix in CUDA C++, but I have hit a speed bottleneck and need help. The following uses a 2×2×2 matrix A as an example to demonstrate the flip function (A = reshape(1:8,2,2,2)):
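
As a point of reference for the flip described in this question, here is a minimal host-side C++ sketch of the index mapping. It is not the asker's code. It assumes column-major storage (as in MATLAB), and the name flip3d and the 0-based axis argument are hypothetical choices made for illustration. The same mapping could be computed once per thread inside a CUDA kernel, but that variant is not shown here.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical helper: flip a column-major 3D array along one axis.
// axis is 0-based: 0 = rows, 1 = columns, 2 = pages
// (MATLAB's flip(A, dim) uses dim = axis + 1).
std::vector<int> flip3d(const std::vector<int>& in,
                        const std::array<std::size_t, 3>& dims,
                        int axis) {
    std::vector<int> out(in.size());
    for (std::size_t k = 0; k < dims[2]; ++k) {
        for (std::size_t j = 0; j < dims[1]; ++j) {
            for (std::size_t i = 0; i < dims[0]; ++i) {
                // Mirror only the coordinate on the chosen axis.
                std::array<std::size_t, 3> src = {i, j, k};
                src[axis] = dims[axis] - 1 - src[axis];

                // Column-major linear indices for destination and source.
                std::size_t dst_idx = i + dims[0] * (j + dims[1] * k);
                std::size_t src_idx =
                    src[0] + dims[0] * (src[1] + dims[1] * src[2]);

                out[dst_idx] = in[src_idx];
            }
        }
    }
    return out;
}
```

For A = reshape(1:8,2,2,2), the column-major storage is 1 through 8 in order, so calling flip3d with axis 0 swaps the two rows within every column, and axis 2 swaps the two 2×2 pages. Each output element depends on exactly one input element, which is why the operation maps naturally onto one thread per element.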

CUDA copy class object containing pointer to another class

I am trying to copy a class object that contains pointers to another class. In particular, I have a class LikelihoodConstructor which contains an array of pointers to another class, DataModel, which in turn contains an array ‘bins’ that I am trying to access. Essentially, what I would like to run in the kernel is the following:

Summation of a polynomial in CUDA

I would like to perform a summation operation on a polynomial inside a CUDA kernel, which contains coefficients and a function as given