I am trying to export teknium/OpenHermes-2.5-Mistral-7B to ONNX. This is my code:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import onnx

model_name = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Set the model to evaluation mode
model.eval()

# Prepare a dummy input for the export
dummy_input = tokenizer("This is a test input", return_tensors="pt").input_ids

# Export the model to ONNX
torch.onnx.export(
    model,
    dummy_input,
    "openhermes_onnx.onnx",
    input_names=["input_ids"],
    output_names=["output"],
    dynamic_axes={
        "input_ids": {0: "batch_size", 1: "sequence_length"},
        "output": {0: "batch_size", 1: "sequence_length"},
    },
    opset_version=14,
    do_constant_folding=True,
)
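
As a sanity check after the export, I believe the graph can be validated like this (for models over 2 GB, onnx.checker has to be given the file path rather than the in-memory proto, so it can resolve the external weight files):

import onnx

# Passing the path lets the checker resolve the external
# weight files that live next to the .onnx file.
onnx.checker.check_model("openhermes_onnx.onnx")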
The export generated the file openhermes_onnx.onnx, but alongside it there are 294 other files with names like model.embed_tokens.weight, model.layers.0.post_attention_layernorm.weight, _model_layers.0_self_attn_rotary_emb_Constant_attr__value, and onnx__MatMul_10296.
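
From what I understand, this is expected: protobuf caps a single ONNX file at 2 GB, so torch.onnx.export spills every large tensor into its own external data file next to the .onnx file. A minimal sketch of how the weights could be gathered into a single sidecar file (the names openhermes_onnx_consolidated.onnx and openhermes_onnx.data are my own choices, not produced by the export):

import onnx

# onnx.load pulls in the external tensor files automatically,
# as long as they sit in the same directory as the .onnx file.
model = onnx.load("openhermes_onnx.onnx")

# Re-save with all weights gathered into one external data file.
onnx.save_model(
    model,
    "openhermes_onnx_consolidated.onnx",
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="openhermes_onnx.data",
    size_threshold=1024,
)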
Note: I have tried to load the exported model openhermes_onnx.onnx, but when I execute this code:

import onnxruntime as ort

onnx_model_path = "openhermes_onnx.onnx"
ort_session = ort.InferenceSession(onnx_model_path)

the kernel dies and I have to restart it.
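
My suspicion is that the session simply runs out of memory: the export is fp32, so a 7B-parameter model needs roughly 7e9 × 4 bytes ≈ 28 GB of RAM for the weights alone, and ONNX Runtime also resolves the external weight files relative to the model path. A minimal diagnostic sketch I would run first (psutil is an assumption on my side, not part of the original script):

import os
import psutil
import onnxruntime as ort

# fp32 weights for a 7B model alone need ~28 GB of RAM.
print(f"Available RAM: {psutil.virtual_memory().available / 1e9:.1f} GB")

# Use an absolute path so the external weight files in the same
# directory are found regardless of the current working directory.
onnx_model_path = os.path.abspath("openhermes_onnx.onnx")
ort_session = ort.InferenceSession(
    onnx_model_path,
    providers=["CPUExecutionProvider"],
)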