How to configure llama-cpp-python to use more vCPUs for running an LLM
I am using llama-cpp-python to run Mistral-7B-Instruct-v0.3-GGUF on an Azure Virtual Machine.
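For reference, a minimal sketch of how the model is being loaded, assuming llama-cpp-python's `Llama` API; the model path and thread counts below are placeholders, not my actual values:

```python
from llama_cpp import Llama

# Placeholder path; adjust to wherever the GGUF file lives on the VM.
MODEL_PATH = "./Mistral-7B-Instruct-v0.3.Q4_K_M.gguf"

# n_threads sets the CPU threads used during generation;
# n_threads_batch sets the threads used during prompt processing.
llm = Llama(
    model_path=MODEL_PATH,
    n_ctx=4096,
    n_threads=8,        # generation threads (placeholder value)
    n_threads_batch=8,  # prompt-processing threads (placeholder value)
)

output = llm("Q: What is the capital of France? A:", max_tokens=32)
print(output["choices"][0]["text"])
```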