I encountered an error when loading a model from Hugging Face. The same script works on Google Colab but fails on my Windows machine. I am using Python 3.10.0.
The error output is shown below:
E:\Internships\ConsciusAI\.venv\lib\site-packages\huggingface_hub\file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.
E:\Internships\ConsciusAI\.venv\lib\site-packages\transformers\quantizers\auto.py:159: UserWarning: You passed `quantization_config` or equivalent parameters to `from_pretrained` but the model you're loading already has a `quantization_config` attribute. The `quantization_config` from the model will be used.
  warnings.warn(warning_msg)
Traceback (most recent call last):
  File "E:\Internships\ConsciusAI\email_2.py", line 77, in <module>
    main()
  File "E:\Internships\ConsciusAI\email_2.py", line 71, in main
    summary = summarize_email(content)
  File "E:\Internships\ConsciusAI\email_2.py", line 22, in summarize_email
    pipeline = transformers.pipeline(
  File "E:\Internships\ConsciusAI\.venv\lib\site-packages\transformers\pipelines\__init__.py", line 906, in pipeline
    framework, model = infer_framework_load_model(
  File "E:\Internships\ConsciusAI\.venv\lib\site-packages\transformers\pipelines\base.py", line 283, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "E:\Internships\ConsciusAI\.venv\lib\site-packages\transformers\models\auto\auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "E:\Internships\ConsciusAI\.venv\lib\site-packages\transformers\modeling_utils.py", line 3165, in from_pretrained
    hf_quantizer.validate_environment(
  File "E:\Internships\ConsciusAI\.venv\lib\site-packages\transformers\quantizers\quantizer_bnb_4bit.py", line 62, in validate_environment
    raise ImportError(
ImportError: Using `bitsandbytes` 8-bit quantization requires Accelerate: `pip install accelerate` and the latest version of bitsandbytes: `pip install -i https://pypi.org/simple/ bitsandbytes`
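
For reference, a quick way to check whether the packages the error message asks for are even visible to this interpreter (a hypothetical helper snippet, not part of my script):

import importlib.util

# Check that accelerate and bitsandbytes are importable in the same
# environment that runs email_2.py.
for pkg in ("accelerate", "bitsandbytes"):
    spec = importlib.util.find_spec(pkg)
    print(f"{pkg}: {'found' if spec else 'MISSING'}")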
This is the code I used:
import transformers
import torch

def summarize_email(content):
    # 4-bit quantized Llama 3 8B Instruct checkpoint from Unsloth
    model_id = "unsloth/llama-3-8b-Instruct-bnb-4bit"

    pipeline = transformers.pipeline(
        "text-generation",
        model=model_id,
        model_kwargs={
            "torch_dtype": torch.float16,
            "quantization_config": {"load_in_4bit": True},
            "low_cpu_mem_usage": True,
        },
    )

    messages = [
        {"role": "system", "content": "You are good at Summarizing"},
        {"role": "user", "content": "Summarize the email for me " + content},
    ]

    # Build the prompt string from the model's chat template
    prompt = pipeline.tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )

    # Stop on either the regular EOS token or Llama 3's end-of-turn token
    terminators = [
        pipeline.tokenizer.eos_token_id,
        pipeline.tokenizer.convert_tokens_to_ids("<|eot_id|>"),
    ]

    outputs = pipeline(
        prompt,
        max_new_tokens=256,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.6,
        top_p=0.9,
    )

    # Return only the newly generated text (everything after the prompt)
    return outputs[0]["generated_text"][len(prompt):]
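
For context, this is roughly how the function is called (a minimal sketch; my real main() reads the email content from a file, and the text here is a placeholder):

def main():
    # Placeholder email body; the real script loads this from a file
    content = "Hi team, please find the quarterly report attached..."
    summary = summarize_email(content)
    print(summary)

if __name__ == "__main__":
    main()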
I am trying to summarize text using "unsloth/llama-3-8b-Instruct-bnb-4bit" from Hugging Face.
The same code summarizes the text fine on Google Colab and Kaggle, but not on my local machine.