On a desktop PC running Fedora 39 with an NVIDIA GTX 650 GPU, I did a system update that included a new kernel (6.9.7).
Now I am updating the NVIDIA driver manually (from 470.239.06 to 470.256.02, as found at http://www.nvidia.com/object/unix.html) by switching to runlevel 3 and then running sh NVIDIA-Linux-x86_64-470.256.02.run.
This procedure works, and the following confirms it:
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 470.256.02 Thu May 2 14:37:44 UTC 2024
GCC version: gcc version 13.3.1 20240522 (Red Hat 13.3.1-1) (GCC)
cat /sys/module/nvidia/version
470.256.02
dkms status
nvidia/470.256.02, 6.9.7-100.fc39.x86_64, x86_64: installed
I also checked the timestamp on /sys/module/nvidia/version, which is 11:41, the time at which I updated the driver.
Everything points to a successful installation of the driver, and running startx gets me into graphics mode as expected. All well.
However, after a reboot, graphics mode is not reached: I am presented with a console login, and dmesg tells me:
NVRM: API mismatch: the client has the version 470.256.02, but this kernel module has the version 470.239.06
That is, remnants of the old driver exist somewhere (and they were not there before the reboot!):
cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 470.239.06 Sat Feb 3 06:03:07 UTC 2024
GCC version: gcc version 13.3.1 20240522 (Red Hat 13.3.1-1) (GCC)
cat /sys/module/nvidia/version
470.239.06
dkms status
nvidia/470.256.02, 6.9.7-100.fc39.x86_64, x86_64: installed
Only dkms shows the correct version. The first two have reverted to the old version.
Also, the timestamp of /sys/module/nvidia/version (and likewise of /proc/driver/nvidia/version) is now 11:59, the reboot time, and not 11:41, which it had after the driver update and before the reboot.
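The three checks above can be collected in one go. The following is only a minimal sketch: the helper names nvrm_version and report are made up for illustration, and the modinfo -F version nvidia call is an addition not used in the checks above. It reads the version field of the nvidia module file found on disk for the running kernel, so a difference between it and the loaded version would suggest that the loaded module was read from somewhere else.

```shell
#!/bin/sh
# Hypothetical helper: extract the driver version from the "NVRM version:"
# line of /proc/driver/nvidia/version (read from stdin).
nvrm_version() {
    sed -n 's/.*Kernel Module[[:space:]]*\([0-9][0-9.]*\).*/\1/p'
}

# Hypothetical helper: compare the loaded version ($1) with the on-disk one ($2).
report() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "incomplete: loaded=${1:-?} on-disk=${2:-?}"
    elif [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "MISMATCH: loaded=$1 on-disk=$2"
    fi
}

# Version of the module currently loaded in the kernel (empty if not loaded).
loaded=
if [ -r /proc/driver/nvidia/version ]; then
    loaded=$(nvrm_version < /proc/driver/nvidia/version)
fi

# Version of the module file on disk for the running kernel (empty if none).
ondisk=$(modinfo -F version nvidia 2>/dev/null) || ondisk=

report "$loaded" "$ondisk"

# What DKMS believes is installed, if dkms is available.
if command -v dkms >/dev/null 2>&1; then
    dkms status
fi
```

Running this once right after the driver update and once after the reboot would show the two states side by side.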
Another observation: when, after the above failure, I run the shell script (from the console) to load the old driver, I get the usual question from the script that another driver seems to be installed and whether I want to proceed, and so on. Without my answering anything, the system enters graphics mode (because it is still in runlevel 5 and gdm keeps checking whether the X server can be started). So it seems that merely running the shell script, without telling it to do anything, fixes the versions in e.g. /sys/module/nvidia/version to the new driver.
But this is not stable: it reverts after another reboot.
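Because the old version only comes back at boot, one place that may be worth ruling out is the initramfs, which on Fedora is built by dracut and can carry its own copy of kernel modules. This is an assumption, not something the output above establishes. A minimal sketch, with nvidia_modules as a hypothetical helper name, that lists any nvidia module files inside the initramfs of the running kernel and prints the path of the module file found on disk:

```shell
#!/bin/sh
# Hypothetical helper: from a file listing on stdin, keep only the lines that
# end in an nvidia kernel module file name (nvidia.ko, nvidia-drm.ko.xz, ...).
nvidia_modules() {
    grep -E '(^|/)nvidia[^/]*\.ko(\.[a-z0-9]+)?$' || true
}

# Modules packed into the initramfs of the running kernel.
# lsinitrd is part of dracut; with no arguments it inspects the default image.
if command -v lsinitrd >/dev/null 2>&1; then
    echo "nvidia modules in initramfs:"
    lsinitrd 2>/dev/null | nvidia_modules
fi

# Path of the nvidia module file on disk for the running kernel.
if command -v modinfo >/dev/null 2>&1; then
    echo "on-disk module:"
    modinfo -n nvidia 2>/dev/null || echo "(no nvidia module found on disk)"
fi
```

Reading the initramfs image typically requires root, so this is best run with sudo or from a root shell.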
The temporary workaround is to install the old driver. Everything works with that, and the versions don’t change.
Does anyone know what changes the driver versions, and why, while leaving dkms intact?