I’m trying to use Ray Serve with vLLM and am hitting `AttributeError: 'VLLMDeployment' object has no attribute '_serve_asgi_lifespan'`. I would like to know how to solve this issue.
Steps
1. Download the Ray Docker image (`2.35.0-py312-gpu`) and start a container from it.
   https://docs.ray.io/en/latest/ray-overview/installation.html#launch-ray-in-docker
2. Following this page, run `pip install "ray[serve]" requests vllm`, create a file `llm.py`, and run `serve run llm:build_app model="NousResearch/Meta-Llama-3-8B-Instruct" tensor-parallel-size=2`.
   https://docs.ray.io/en/latest/serve/tutorials/vllm-example.html
Part of the output of `serve run ...` is shown below.
```
(ServeReplica:default:VLLMDeployment pid=2839) ERROR 2024-09-16 01:40:43,815 default_VLLMDeployment imhgemig replica.py:1202 - Exception during graceful shutdown of replica: 'VLLMDeployment' object has no attribute '_serve_asgi_lifespan'
(ServeReplica:default:VLLMDeployment pid=2839)   File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/serve/_private/replica.py", line 1196, in call_destructor
(ServeReplica:default:VLLMDeployment pid=2839)     await self._call_func_or_gen(self._callable.__del__)
(ServeReplica:default:VLLMDeployment pid=2839)     result = await result
(ServeReplica:default:VLLMDeployment pid=2839)   File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/serve/api.py", line 225, in __del__
(ServeReplica:default:VLLMDeployment pid=2839)     await ASGIAppReplicaWrapper.__del__(self)
(ServeReplica:default:VLLMDeployment pid=2839)   File "/home/ray/anaconda3/lib/python3.12/site-packages/ray/serve/_private/http_util.py", line 472, in __del__
(ServeReplica:default:VLLMDeployment pid=2839)     with LoggingContext(self._serve_asgi_lifespan.logger, level=logging.WARNING):
(ServeReplica:default:VLLMDeployment pid=2839) AttributeError: 'VLLMDeployment' object has no attribute '_serve_asgi_lifespan'
```
Current situation
I have just started studying Ray, so I am not familiar with Ray or FastAPI. I also haven’t been able to find out what `_serve_asgi_lifespan` is.
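From the traceback, my guess is that `_serve_asgi_lifespan` is an attribute Ray Serve normally sets on the replica while initializing the FastAPI/ASGI wrapper, and that the destructor reads it unconditionally. If the replica fails partway through startup, the shutdown path would then hit exactly this AttributeError. Here is a minimal, Ray-free sketch of that general Python pattern (all names are hypothetical, not Ray's actual code):

```python
# Hypothetical sketch: an attribute set only on successful construction,
# read unconditionally during teardown. Names are illustrative, not Ray's code.
class Replica:
    def __init__(self):
        # In the real failure, startup could raise before this line ever runs.
        self._serve_asgi_lifespan = "lifespan"

    def shutdown(self):
        # Stands in for a destructor that assumes __init__ completed.
        return self._serve_asgi_lifespan

# Simulate a replica whose __init__ never completed:
broken = Replica.__new__(Replica)  # object allocated, __init__ skipped
try:
    broken.shutdown()
except AttributeError as e:
    print(e)  # 'Replica' object has no attribute '_serve_asgi_lifespan'
```

If that guess is right, the AttributeError during graceful shutdown would only be a secondary symptom, and the real cause would be an earlier failure during replica startup (further up in the logs).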