Tag Archive for python, server, large-language-model, ray

How to implement a Ray server with multiple GPUs?

I’m trying to implement a multi-GPU local server with Ray and vLLM. I have uploaded my full code and commands to this GitHub repository. In short, I want to serve a large model that requires 2 GPUs, but it only uses 1. I have made sure that my CUDA environment is in good shape and that both GPUs are detectable by torch. Thanks in advance for any help.
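Without seeing the linked repository, a common cause of this symptom is that vLLM defaults to `tensor_parallel_size=1`, so the model is loaded onto a single GPU regardless of how many are visible. A minimal sketch, assuming the goal is to shard the model across both GPUs (the model name is a placeholder, and `pick_tensor_parallel_size` is a hypothetical helper added here for illustration):

```python
def pick_tensor_parallel_size(available_gpus: int, required_gpus: int = 2) -> int:
    """Clamp the tensor-parallel degree to the GPUs actually available."""
    if available_gpus < 1:
        raise RuntimeError("no CUDA GPUs detected")
    return min(required_gpus, available_gpus)


if __name__ == "__main__":
    import torch
    from vllm import LLM, SamplingParams

    tp = pick_tensor_parallel_size(torch.cuda.device_count(), required_gpus=2)

    # tensor_parallel_size > 1 shards the model weights across that many
    # GPUs; with the Ray backend, vLLM spawns one worker per GPU.
    llm = LLM(
        model="meta-llama/Llama-2-13b-hf",  # placeholder model name
        tensor_parallel_size=tp,
    )
    outputs = llm.generate(["Hello"], SamplingParams(max_tokens=32))
    print(outputs[0].outputs[0].text)
```

If `tensor_parallel_size` is already set to 2 in your code, the next things to check are whether `CUDA_VISIBLE_DEVICES` is restricting visibility for the Ray workers, and whether the Ray cluster was started with both GPUs registered (e.g. `ray status` showing 2 GPUs).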