I am running Ollama on a 4xA100 GPU server, but it looks like only one GPU is used for the llama3:7b model. How can I use all 4 GPUs simultaneously?
I am not using Docker; I just use ollama serve and ollama run.
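For reference, this is roughly what I run (default settings, nothing else configured as far as I recall):

# terminal 1: start the Ollama server (listens on the default port 11434)
ollama serve

# terminal 2: pull and run the model
ollama run llama3:7b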
Or is there a way to run 4 servers simultaneously on different ports, for large-batch processing? Something like the sketch below is what I have in mind.
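A rough sketch of the idea, assuming OLLAMA_HOST and CUDA_VISIBLE_DEVICES are the right knobs to bind each server to its own port and GPU (I have not confirmed this is a supported setup):

# hypothetical: one ollama serve instance per GPU, each on its own port
CUDA_VISIBLE_DEVICES=0 OLLAMA_HOST=127.0.0.1:11434 ollama serve &
CUDA_VISIBLE_DEVICES=1 OLLAMA_HOST=127.0.0.1:11435 ollama serve &
CUDA_VISIBLE_DEVICES=2 OLLAMA_HOST=127.0.0.1:11436 ollama serve &
CUDA_VISIBLE_DEVICES=3 OLLAMA_HOST=127.0.0.1:11437 ollama serve &
# the batch job would then round-robin its requests across ports 11434-11437

Here is nvidia-smi while llama3:7b is generating; only GPU 0 is busy, the other three sit idle: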
Wed May 15 01:24:29 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.129.03             Driver Version: 535.129.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA A100 80GB PCIe          On  | 00000000:17:00.0 Off |                    0 |
| N/A   63C    P0             293W / 300W |  39269MiB / 81920MiB |     88%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA A100 80GB PCIe          On  | 00000000:65:00.0 Off |                    0 |
| N/A   28C    P0              51W / 300W |      7MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   2  NVIDIA A100 80GB PCIe          On  | 00000000:CA:00.0 Off |                    0 |
| N/A   28C    P0              51W / 300W |      7MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
|   3  NVIDIA A100 80GB PCIe          On  | 00000000:E3:00.0 Off |                    0 |
| N/A   29C    P0              52W / 300W |      7MiB / 81920MiB |      0%      Default |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|========================================================================================|
|    0   N/A  N/A   3420401      C   ...unners/cuda_v11/ollama_llama_server     39256MiB |
+---------------------------------------------------------------------------------------+