cudaMalloc() fails with "unknown error" under Slurm but works correctly with mpirun
I’m running a Slurm controller with a couple of GPU nodes. All nodes share the same $HOME directory, which has Intel MPI installed. A simple MPI version of a CUDA program that calls cudaMalloc() works correctly when launched directly with mpirun, but fails when launched with srun or sbatch.
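For illustration, a minimal sketch of the kind of program described above might look like the following. This is a hypothetical reproducer, not the actual code from this setup: the 1 MiB allocation size and the extra cudaGetDeviceCount() call are assumptions added only to show which CUDA call reports the error on each rank.

```cuda
// Hypothetical minimal reproducer (not the original program from the post).
// Each MPI rank queries the visible GPUs, then attempts one cudaMalloc().
#include <cstdio>
#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    // Assumption: printing the device count helps tell "no GPU visible"
    // apart from a failure inside cudaMalloc() itself.
    int device_count = 0;
    cudaError_t err = cudaGetDeviceCount(&device_count);
    std::printf("rank %d: cudaGetDeviceCount -> %s (%d device(s))\n",
                rank, cudaGetErrorString(err), device_count);

    // Assumption: 1 MiB is an arbitrary size chosen for this sketch.
    void *buf = nullptr;
    err = cudaMalloc(&buf, 1 << 20);
    std::printf("rank %d: cudaMalloc -> %s\n", rank, cudaGetErrorString(err));

    if (err == cudaSuccess) {
        cudaFree(buf);
    }

    MPI_Finalize();
    return err == cudaSuccess ? 0 : 1;
}
```

Assuming the binary is built as ./a.out, running it once as `mpirun -n 2 ./a.out` and once as `srun -n 2 ./a.out` would show the per-rank result under each launcher. The build command depends on the local CUDA and Intel MPI installation, so it is left out here.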