vLLM + FastAPI async streaming response – FastAPI can't keep up with vLLM's output and becomes a bottleneck

I have a chatbot web app with the following components: