2024-05-02T17:06:59.807336Z INFO download: text_generation_launcher: Successfully downloaded weights.
2024-05-02T17:06:59.807762Z INFO shard-manager: text_generation_launcher: Starting shard rank=0
2024-05-02T17:07:02.879107Z WARN text_generation_launcher: Could not import Flash Attention enabled models: CUDA is not available
2024-05-02T17:07:03.812736Z ERROR shard-manager: text_generation_launcher: Shard complete standard error output:
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 8, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/cli.py", line 71, in serve
    from text_generation_server import server
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/server.py", line 17, in <module>
    from text_generation_server.models.vlm_causal_lm import VlmCausalLMBatch
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/vlm_causal_lm.py", line 14, in <module>
    from text_generation_server.models.flash_mistral import (
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_mistral.py", line 18, in <module>
    from text_generation_server.models.custom_modeling.flash_mistral_modeling import (
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/custom_modeling/flash_mistral_modeling.py", line 29, in <module>
    from text_generation_server.utils import paged_attention, flash_attn
  File "/opt/conda/lib/python3.10/site-packages/text_generation_server/utils/flash_attn.py", line 24, in <module>
    raise ImportError("CUDA is not available")
ImportError: CUDA is not available
rank=0
2024-05-02T17:07:03.910103Z ERROR text_generation_launcher: Shard 0 failed to start
2024-05-02T17:07:03.910129Z INFO text_generation_launcher: Shutting down shards
Error: ShardCannotStart
NVIDIA-SMI 470.239.06 Driver Version: 470.239.06 CUDA Version: 11.4
CUDA is installed. The versions reported by nvidia-smi and nvcc --version match, and the bitsandbytes build I installed also targets CUDA 11.4:
pip install bitsandbytes-cuda114
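One way to narrow this down is to check what PyTorch itself sees from inside the same environment that runs text-generation-launcher, since nvidia-smi only confirms the driver on the host. The following is a minimal sketch, assuming PyTorch may or may not be importable there; cuda_report is a hypothetical helper name made up for this post, not part of text-generation-inference.

```python
import importlib.util
import os


def cuda_report():
    """Collect what PyTorch sees in the current environment.

    cuda_report is a hypothetical helper for this post, not a TGI function.
    """
    report = {
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": None,     # filled in only if torch can be imported
        "torch_cuda_build": None,   # CUDA version torch was built against
        "device_count": None,
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
    }
    if report["torch_installed"]:
        import torch

        report["cuda_available"] = torch.cuda.is_available()
        # torch.version.cuda is None for CPU-only builds of PyTorch.
        report["torch_cuda_build"] = torch.version.cuda
        report["device_count"] = torch.cuda.device_count()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If cuda_available comes back False or torch_cuda_build is None, the problem is probably in how this environment sees the GPU (a CPU-only PyTorch build, or the GPU not being exposed to the process) rather than in bitsandbytes.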
Any suggestions would be appreciated.