I’m encountering a problem when trying to pass an embedding to my model via `inputs_embeds`:
```
ValueError: You passed `inputs_embeds` to `.generate()`, but the model class LlamaForCausalLM doesn't have its forwarding implemented. See the GPT2 implementation for an example (https://github.com/huggingface/transformers/pull/21405), and feel free to open a PR with it!
```
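From what I can tell, this error is raised inside `transformers`' `generate()`, which inspects the model's `prepare_inputs_for_generation` signature and refuses to forward `inputs_embeds` when that parameter is missing. Here is a minimal check of my environment (this is my assumption about where the check happens; nothing in it is Unsloth-specific):

```python
# Sketch: check whether the installed transformers version declares an
# inputs_embeds parameter on LlamaForCausalLM.prepare_inputs_for_generation.
# My assumption is that generate() raises the ValueError above when it doesn't.
import inspect

import transformers
from transformers import LlamaForCausalLM

print(transformers.__version__)
params = inspect.signature(LlamaForCausalLM.prepare_inputs_for_generation).parameters
print("inputs_embeds" in params)  # False on versions that raise this error
```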
Can someone help me understand what is going on and how to fix this issue?
I have an embedding of shape `torch.Size([1, 46, 4096])` that I want to pass to the model. Here is my code:
```python
if True:
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name = "/content/drive/My Drive/finetuneunslothllama",
        max_seq_length = max_seq_length,
        dtype = dtype,
        load_in_4bit = load_in_4bit,
    )
    FastLanguageModel.for_inference(model)  # Enable native 2x faster inference

outputs = model.generate(inputs_embeds=embeddings, max_new_tokens=64, use_cache=True)
generated_text = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(generated_text)
```
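In case it helps anyone reproduce this: a compatible `embeddings` tensor can be built from the model's own input embedding layer, roughly like this (a sketch; the prompt string is a placeholder, and this is just one way to get a tensor of that shape):

```python
# Sketch: build an inputs_embeds tensor from the model's token embedding layer.
# The prompt is a placeholder; the result has shape [batch, seq_len, hidden_size],
# e.g. torch.Size([1, 46, 4096]) for a 46-token prompt on a Llama model with
# hidden size 4096.
import torch

input_ids = tokenizer("some prompt text", return_tensors="pt").input_ids.to(model.device)
with torch.no_grad():
    embeddings = model.get_input_embeddings()(input_ids)
print(embeddings.shape)
```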