Ollama fails to utilize GPU after driver update (NVIDIA)

As we're working - just like everyone else :-) - with AI tooling, we're using ollama to host our LLMs. After updating to the recent NVIDIA driver (555.85), we noticed that ollama no longer uses our GPU.

Testing the GPU mapping to the container shows the GPU is still there:

docker run -it --gpus=all --rm nvidia/cuda:12.4.1-base-ubuntu20.04 nvidia-smi
Thu May 23 15:17:44 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.42.03              Driver Version: 555.85         CUDA Version: 12.5     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        On  |   00000000:01:00.0 Off |                  Off |
|  0%   38C    P8             14W /  450W |       0MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

GPU availability check

Long story short: the cause seems to be an incompatibility between NVIDIA driver 555.85 and ollama. Downgrade the driver (for example to 552.44) and all is fine again :-)
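
If you want to check quickly whether a machine runs the driver version that gave us trouble, something like the following could work. This is only a rough sketch: the function name is_problem_driver is made up for illustration, the version 555.85 is simply the one from this post (other versions may or may not be affected), and it assumes nvidia-smi is on the PATH of the machine you run it on.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this post): succeeds when the given
# version string is the driver version we had trouble with (555.85).
is_problem_driver() {
    [ "$1" = "555.85" ]
}

# Only query the GPU when nvidia-smi is actually available.
if command -v nvidia-smi >/dev/null 2>&1; then
    # Print just the driver version; with several GPUs, take the first line.
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if is_problem_driver "$version"; then
        echo "Driver $version: the version from this post, consider downgrading"
    else
        echo "Driver $version: not the version from this post"
    fi
fi
```

A match only tells you that you are on the same driver we were; to confirm that ollama really fell back to the CPU, look at the ollama logs or at GPU memory usage in nvidia-smi while a model is loaded.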

Here's the GH issue: https://github.com/ollama/ollama/issues/4563