I am trying to export teknium/OpenHermes-2.5-Mistral-7B to ONNX.
This is my code:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import onnx
model_name = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Set the model to evaluation mode
model.eval()
# Prepare a dummy input for the export
dummy_input = tokenizer("This is a test input", return_tensors="pt").input_ids
# Export the model to ONNX
torch.onnx.export(
    model,
    dummy_input,
    "openhermes_onnx.onnx",
    input_names=["input_ids"],
    output_names=["output"],
    dynamic_axes={
        "input_ids": {0: "batch_size", 1: "sequence_length"},
        "output": {0: "batch_size", 1: "sequence_length"},
    },
    opset_version=14,
    do_constant_folding=True,
)
It generated the file openhermes_onnx.onnx, but also 294 other files in the same folder, with names such as:
model.embed_tokens.weight, model.layers.0.post_attention_layernorm.weight, _model_layers.0_self_attn_rotary_emb_Constant_attr__value and onnx__MatMul_10296
Note: I have tried to load the exported model,
openhermes_onnx.onnx, but when I execute this code:
import onnxruntime as ort
onnx_model_path = "openhermes_onnx.onnx"
ort_session = ort.InferenceSession(onnx_model_path)
the kernel dies and I have to restart it.
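A notebook kernel that dies without a traceback gives little to go on. One guess, not a confirmed diagnosis, is that the process runs out of memory: roughly 7 billion parameters at 4 bytes each is on the order of 28 GB if the weights were exported in float32. The sketch below is a generic way to run the failing load in a separate Python process so the notebook survives and the exit status can be inspected. The names run_isolated and LOAD_SNIPPET are made up for illustration.

```python
import subprocess
import sys


def run_isolated(code, timeout=None):
    """Run a Python snippet in a child interpreter and capture its outcome."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode the captured output as str
        timeout=timeout,
    )


# The load step from the post, kept as a string so it only runs in the child.
LOAD_SNIPPET = """
import onnxruntime as ort
ort_session = ort.InferenceSession("openhermes_onnx.onnx")
print([i.name for i in ort_session.get_inputs()])
"""
```

Calling result = run_isolated(LOAD_SNIPPET) and then looking at result.returncode and result.stderr shows how the child ended. On POSIX systems a negative return code -N means the child was terminated by signal N, so -9 would be consistent with the operating system killing the process, for example when memory runs out.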