I am trying to load a model from a local path into LlamaIndex with the HuggingFaceLLM class, like this:
from llama_index.llms.huggingface import HuggingFaceLLM

llm = HuggingFaceLLM(
    context_window=2048,
    max_new_tokens=300,
    generate_kwargs={"temperature": 0.5, "do_sample": True},
    # query_wrapper_prompt=query_wrapper_prompt,
    tokenizer_name="local_path/leo-hessianai-7B-AWQ",
    model_name="local_path/leo-hessianai-7B-AWQ",
    device_map="auto",
)
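The query_wrapper_prompt is commented out on purpose. In case that matters: if a template is needed, I assume it would be passed as a PromptTemplate, roughly like this (the template text here is just a placeholder, not the model's actual prompt format):

from llama_index.core import PromptTemplate

# Placeholder template only; the real format would come from the model card.
# {query_str} is where LlamaIndex inserts the query.
query_wrapper_prompt = PromptTemplate("Question: {query_str}\nAnswer:")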
The folders were downloaded from the Hugging Face Hub, and the model loads fine. However, when I query it, it returns only gibberish ("hohohohohohohohohohoho" and so on).
The retrieved source nodes are plausible and correct (I checked that); it is only the generation step that appears to be wrong.
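One thing I am unsure about: I am not passing any load kwargs, so the weights are loaded with whatever defaults apply. If the AWQ checkpoint needs explicit kwargs, I assume they could be forwarded via model_kwargs, which as far as I understand is handed through to AutoModelForCausalLM.from_pretrained, e.g.:

import torch
from llama_index.llms.huggingface import HuggingFaceLLM

# Sketch: same setup as above, but with explicit load kwargs forwarded to
# from_pretrained via model_kwargs (assumption: this is where dtype belongs).
llm = HuggingFaceLLM(
    context_window=2048,
    max_new_tokens=300,
    generate_kwargs={"temperature": 0.5, "do_sample": True},
    tokenizer_name="local_path/leo-hessianai-7B-AWQ",
    model_name="local_path/leo-hessianai-7B-AWQ",
    device_map="auto",
    model_kwargs={"torch_dtype": torch.float16},  # AWQ weights are usually fp16
)

I have not verified that this changes anything, though.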
Is there anything I am missing here? When I load the model from the Hub by its ID, it works fine, but that does not work in the IDE (and Ollama etc. are also not an option).
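For reference, my standalone sanity check (plain transformers, no LlamaIndex in between) looks roughly like this; pointed at the Hub ID it produces sensible text, and I would expect the same call with the local path to behave identically:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Generate directly from the same local folder, without LlamaIndex involved.
path = "local_path/leo-hessianai-7B-AWQ"
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(path, device_map="auto")

inputs = tokenizer("Hallo, wie geht es dir?", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.5)
print(tokenizer.decode(output[0], skip_special_tokens=True))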
I appreciate any help, thanks!