I am currently working on a medical image analysis project where GPU acceleration is crucial for performance. Despite enabling the GPU through the green icon in the top right corner of the Lightning.AI interface, Torch indicates that CUDA is not available (torch.cuda.is_available() returns False). Below are the details of my setup and the issue I’m encountering:
- CUDA Compiler Version: Running nvcc in the terminal reports CUDA compilation tools release 12.4.
- GPU Information: Using lspci, my system detects an NVIDIA Tesla T4 GPU (revision a1).
- NVIDIA Driver and CUDA Version: The nvidia-smi tool indicates that the NVIDIA Driver Version is 535.183.06 and the CUDA Version is 12.2.
Despite these components being correctly installed and seemingly operational, Torch fails to detect the GPU. I have verified the installation of the NVIDIA driver and CUDA toolkit multiple times.
Due to the complexity of the system and my unfamiliarity with Linux, I have refrained from making direct changes to my remote server setup on Lightning.AI Studio. Instead, I have focused on verifying and confirming the following:
- GPU Activation: I have ensured that the GPU is enabled through the interface’s green icon, indicating that it should be operational.
- Driver and CUDA Installation: I have verified the installation of the NVIDIA driver and CUDA toolkit. nvidia-smi reports driver version 535.183.06 and CUDA version 12.2, while nvcc reports CUDA compilation tools release 12.4.
Despite these verifications, Torch continues to report that CUDA is not available (torch.cuda.is_available() returns False). I expected that with the GPU enabled and the correct driver/CUDA versions installed, Torch would detect and utilize CUDA capabilities for GPU acceleration.
Given my limited experience with Linux administration and the critical nature of the remote server setup, I have not attempted to modify configurations directly. Instead, I am seeking advice on potential solutions or further diagnostics to correctly enable CUDA support in Torch on Lightning.AI Studio.
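As a starting point for those diagnostics, here is a minimal sketch that reads what the installed Torch build reports about itself without changing anything on the server. The helper name cuda_report is hypothetical, and the sketch assumes it is run with the same Python interpreter the project uses in the Studio.

```python
import importlib
import os


def cuda_report():
    """Collect read-only facts about the Torch install (hypothetical helper)."""
    # An empty or restrictive CUDA_VISIBLE_DEVICES can hide the GPU from Torch.
    report = {"CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES")}
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # Torch is not installed in this interpreter's environment.
        report["torch_version"] = None
        return report
    report["torch_version"] = torch.__version__
    # CUDA version the Torch binary was built against; None for a CPU-only build.
    report["torch_built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If torch_built_with_cuda prints None, the installed package is a CPU-only build, which would produce the False result regardless of the driver and toolkit versions listed above.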