I want to use GPUDirect Storage (GDS). I followed the instructions in https://docs.nvidia.com/gpudirect-storage/troubleshooting-guide/index.html#mofed-req-install to install it. The installation details are as follows:
First, I installed the CUDA toolkit and driver from here: https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=22.04&target_type=deb_network.
Then, according to https://developer.nvidia.com/gpudirect-storage , GDS should now be part of CUDA. But when I looked in the gds directory to check that it was installed correctly, I found:
(base) no@ho-4:/usr/local/cuda-12.6/gds$ ls -l
total 32
-rw-r--r-- 1 root root 10756 Aug 22 13:36 cufile.json
-rw-r--r-- 1 root root 14290 Aug 22 13:36 README
drwxr-xr-x 2 root root 4096 Sep 23 15:59 tools
The expected listing given in the NVIDIA docs is:
$ ls -lh /usr/local/cuda-X.Y/gds/
total 20K
-rw-r--r-- 1 root root 8.4K Mar 15 13:01 README
drwxr-xr-x 2 root root 4.0K Mar 19 12:29 samples
drwxr-xr-x 2 root root 4.0K mar 19 10:28 tools
I can't find the samples folder, and there is an unfamiliar file, cufile.json, that the docs listing does not show. The samples folder should contain example programs for testing GDS functionality, so it is frustrating that I don't have it. Could someone please help me get the samples folder back?
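In case it helps anyone compare their own install, here is a minimal sketch of the check I am doing by eye. It assumes the three entries from the docs listing above (README, samples, tools) are the ones that should be present; the helper name `missing_entries` is made up for illustration, and the path is the one from my machine.

```python
from pathlib import Path

# Entries shown in the NVIDIA docs listing quoted above (an assumption that
# this list is what a complete install should contain).
EXPECTED = ("README", "samples", "tools")


def missing_entries(gds_dir, expected=EXPECTED):
    """Return the expected names that are absent from gds_dir (hypothetical helper)."""
    root = Path(gds_dir)
    # A missing directory is treated as empty, so every expected name is reported.
    present = {p.name for p in root.iterdir()} if root.is_dir() else set()
    return [name for name in expected if name not in present]


if __name__ == "__main__":
    # Path taken from my install; adjust for a different CUDA version.
    print(missing_entries("/usr/local/cuda-12.6/gds"))
```

With the directory contents shown in my listing, this should report only samples as missing.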
By the way, I installed MLNX_OFED because I need NVMe support. After installing MLNX_OFED using the instructions in https://docs.nvidia.com/gpudirect-storage/troubleshooting-guide/index.html#mofed-req-install , I found the following:
(base) no@ho-4::/usr/local/cuda-12.6/gds/tools$ ./gdscheck.py -v
warn: error opening log file: Permission denied, logging will be disabled
GDS release version: 1.11.1.6
nvidia_fs not loaded, operating on compatible mode. libcufile version: 2.12
Platform: x86_64
The nvidia_fs module is not loaded. I installed it using:
sudo apt install nvidia-fs
But it seems that this is not the newer nvidia_fs version that corresponds to my GDS release. Could someone also tell me how I can get the matching newer version of nvidia_fs?
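For reference, a minimal sketch of how I check whether the module is actually loaded, independent of gdscheck.py. It assumes a Linux system where /proc/modules lists loaded modules one per line with the module name as the first field; the helper name `module_loaded` is made up for illustration.

```python
from pathlib import Path


def module_loaded(name, proc_modules_text):
    """Return True if `name` appears as a module in /proc/modules text (hypothetical helper)."""
    # The kernel reports module names with underscores, so normalise dashes
    # (the package is called nvidia-fs, the module shows up as nvidia_fs).
    wanted = name.replace("-", "_")
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # Only the first field is the module name; later fields list dependents.
        if fields and fields[0] == wanted:
            return True
    return False


if __name__ == "__main__":
    path = Path("/proc/modules")
    text = path.read_text() if path.exists() else ""
    print("nvidia_fs loaded:", module_loaded("nvidia_fs", text))
```

Matching on the first field only avoids a false positive when nvidia_fs merely appears in another module's dependency column.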