I have converted the facebook/bart-large-mnli model to ONNX using torch.onnx.export()
and am trying to perform multi-label text classification with onnxruntime, as shown below:
import onnxruntime as ort
from transformers import AutoTokenizer
import numpy as np
import torch
import torch.nn.functional as F
onnx_model_path = "models/bart-large-mnli.onnx"
tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-mnli")
ort_session = ort.InferenceSession(onnx_model_path)
sentence = "Tech world is set to embrace new innovations"
labels = ["technology", "environment", "entertainment", "sports"]
new_text = f"{sentence}</s></s>"
hypothesis_template = new_text + "{}"
inputs = tokenizer(
    [sentence] * len(labels),
    [hypothesis_template.format(label) for label in labels],
    return_tensors="np",
    padding=True,
    truncation=True
)
inputs_onnx = {
    'input_ids': inputs['input_ids'].astype(np.int64),
    'attention_mask': inputs['attention_mask'].astype(np.int64)
}
logits = ort_session.run(None, inputs_onnx)[0]
probs = F.sigmoid(torch.tensor(logits))
for label, prob in zip(labels, probs):
    scalar_prob = prob.item() if prob.numel() == 1 else prob[0].item()
    print(f"Label: {label}, Probability: {scalar_prob}")
I am using sigmoid instead of softmax because I need an independent probability for each label for a given text, but the output does not match the probabilities I expect. I am also not sure whether this is how BART combines the text and the label for tokenization:
text_to_tokenize = f"{sentence}</s></s>{label}"
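For comparison, here is a minimal sketch of how I understand the zero-shot-classification pipeline in transformers to handle the multi-label case: each label is turned into a hypothesis such as "This example is {label}.", the sentence is passed as the premise, and for each pair a softmax is taken over only the contradiction and entailment logits, keeping the entailment probability. The sketch assumes the ONNX model returns logits of shape (num_labels, 3) in the order contradiction, neutral, entailment (the order in the facebook/bart-large-mnli config). `build_pairs` and `entailment_probs` are hypothetical helper names, and the logits are made up for illustration, not real model output.

```python
import math

# Assumption: logits are ordered [contradiction, neutral, entailment],
# as in the facebook/bart-large-mnli config.
CONTRADICTION_ID = 0
ENTAILMENT_ID = 2


def build_pairs(sentence, labels, template="This example is {}."):
    """Return one (premise, hypothesis) pair per candidate label."""
    return [(sentence, template.format(label)) for label in labels]


def entailment_probs(logits):
    """For each row, softmax over the contradiction and entailment logits
    only, and return the entailment probability."""
    probs = []
    for row in logits:
        c, e = row[CONTRADICTION_ID], row[ENTAILMENT_ID]
        m = max(c, e)  # subtract the max for numerical stability
        exp_c, exp_e = math.exp(c - m), math.exp(e - m)
        probs.append(exp_e / (exp_c + exp_e))
    return probs


sentence = "Tech world is set to embrace new innovations"
labels = ["technology", "environment", "entertainment", "sports"]
pairs = build_pairs(sentence, labels)

# Made-up logits for illustration only, shape (num_labels, 3).
fake_logits = [
    [-2.0, 0.5, 3.0],
    [1.5, 0.2, -1.0],
    [0.0, 0.0, 0.0],
    [2.0, 0.1, -2.0],
]
for label, p in zip(labels, entailment_probs(fake_logits)):
    print(f"Label: {label}, Probability: {p:.3f}")
```

With real inputs, the premises and hypotheses from `pairs` would go to the tokenizer as its first and second arguments. As far as I know, the BART tokenizer then inserts the `</s></s>` separator between the two segments itself, so it should not need to be added to the text by hand.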
I would appreciate any help. Please let me know if you need more info from my end.