Ollama fails to utilize GPU after driver update (NVIDIA)
Ollama can't make use of NVIDIA GPUs with the latest drivers - the fix is easy: downgrade and wait for the next release. :-)
As we're working - just like everyone else :-) - with AI tooling, we're using ollama to host our LLMs. After updating to the recent NVIDIA driver (555.85), we noticed that ollama no longer uses our GPU.
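A quick way to confirm this from ollama's side is to load a model and look at the PROCESSOR column of ollama ps, which reports whether the model is running on GPU or CPU. A minimal sketch, assuming the container is simply named ollama and a model like llama3 has already been pulled:

docker exec -it ollama ollama run llama3 "Say hi"
docker exec -it ollama ollama ps

With the broken driver in place, the PROCESSOR column reads something like "100% CPU" instead of "100% GPU".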
Testing the GPU mapping to the container shows the GPU is still there:
docker run -it --gpus=all --rm nvidia/cuda:12.4.1-base-ubuntu20.04 nvidia-smi
Thu May 23 15:17:44 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.03              Driver Version: 555.85         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        On  |   00000000:01:00.0 Off |                  Off |
|  0%   38C    P8             14W /  450W |       0MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
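The ollama server log is the next place to look: on startup it reports which GPUs it discovered (or why discovery failed). Assuming the container is named ollama, the relevant lines can be filtered like this:

docker logs ollama 2>&1 | grep -iE "cuda|gpu"

If GPU discovery fails here while nvidia-smi inside a container works, the problem usually sits between the driver and the application - exactly what happened with 555.85.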
Long story short: the cause seems to be an incompatibility between NVIDIA driver 555.85 and ollama. Downgrade the driver (for example to 552.44) and all is fine again :-)
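After the downgrade, recreating the ollama container with GPU access brings the GPU back into play. This is the standard command from ollama's Docker docs; volume name and port are of course adjustable to your setup:

docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

The ollama ps check from above should then report "100% GPU" again.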
Here's the GH issue: https://github.com/ollama/ollama/issues/4563