Description
As the title suggests, I’m trying to run a detection engine model with an attached BatchedNMSDynamic_TRT plugin.
In my C++ code, the results of the model are always 0.
I have tested the same engine file in Python and it worked flawlessly. Still, I am attaching the code I used to generate the engine file.
- Translating YOLOv5 output into the correct format mentioned here
import torch
import torch.nn as nn

class ProcModel(nn.Module):
    def __init__(self, model, class_num):
        super(ProcModel, self).__init__()
        self.model = model
        self.num_classes = class_num

    def forward(self, x):
        out = self.model(x)[0]
        # Boxes arrive as (cx, cy, w, h); add a singleton axis at dim 2
        bbox_out = torch.unsqueeze(out[:,:,:4], 2)
        # Convert centre format to corner format (x1, y1, x2, y2)
        x1 = bbox_out[:,:,:,0] - bbox_out[:,:,:,2] / 2
        y1 = bbox_out[:,:,:,1] - bbox_out[:,:,:,3] / 2
        x2 = bbox_out[:,:,:,0] + bbox_out[:,:,:,2] / 2
        y2 = bbox_out[:,:,:,1] + bbox_out[:,:,:,3] / 2
        bbox_out = torch.stack((x1, y1, x2, y2), dim=3)
        # Per-class score = class probability * objectness
        conf_out = out[:,:,4:5]
        class_out = out[:,:,5:] * conf_out
        return [bbox_out, class_out]
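To make the conversion above concrete, here is a minimal pure-Python sketch of the same arithmetic for a single prediction row. The helper names `xywh_to_xyxy` and `class_scores` and the sample row are made up for illustration; they are not part of YOLOv5 or TensorRT.

```python
def xywh_to_xyxy(cx, cy, w, h):
    """Convert a centre-format box to corner format (x1, y1, x2, y2)."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def class_scores(objectness, class_probs):
    """Scale each per-class probability by the objectness score."""
    return [p * objectness for p in class_probs]


# One hypothetical row: [cx, cy, w, h, objectness, class0, class1]
row = [100.0, 50.0, 20.0, 10.0, 0.5, 0.5, 0.25]
box = xywh_to_xyxy(*row[:4])
scores = class_scores(row[4], row[5:])
print(box)     # (90.0, 45.0, 110.0, 55.0)
print(scores)  # [0.25, 0.125]
```

These are the two quantities the wrapper hands to the NMS node: corner-format boxes and objectness-weighted class scores.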
- Converting to onnx:
torch.onnx.export(
    model.cpu(),
    im.cpu(),
    f,
    export_params=True,
    verbose=False,
    opset_version=opset,
    do_constant_folding=True,
    input_names=['images'],
    output_names=output_names,
    dynamic_axes=dynamic)
- Creating and attaching a BatchedNMSDynamic_TRT node to the ONNX graph:
import numpy as np
import onnx
import onnx_graphsurgeon as gs

def create_attrs(input_h, input_w, topK, keepTopK):
    attrs = {}
    attrs["shareLocation"] = 1
    attrs["backgroundLabelId"] = -1
    attrs["numClasses"] = 80
    attrs["topK"] = topK
    attrs["keepTopK"] = keepTopK
    attrs["scoreThreshold"] = 0.25
    attrs["iouThreshold"] = 0.6
    attrs["isNormalized"] = False
    attrs["clipBoxes"] = False
    attrs["plugin_version"] = "1"
    return attrs

def add_nmsplugin_to_onnx(model_file, output_names=('output0-bbox', 'output0-class'), topk=200, keepTopK=100):
    graph = gs.import_onnx(onnx.load(model_file))  # load onnx model
    batch_size = graph.inputs[0].shape[0]
    input_h = graph.inputs[0].shape[2]
    input_w = graph.inputs[0].shape[3]
    tensors = graph.tensors()
    boxes_tensor = tensors[output_names[0]]  # match with onnx model output name
    confs_tensor = tensors[output_names[1]]  # match with onnx model output name
    num_detections = gs.Variable(name="num_detections").to_variable(dtype=np.int32, shape=[batch_size, 1])
    nmsed_boxes = gs.Variable(name="nmsed_boxes").to_variable(dtype=np.float32, shape=[batch_size, keepTopK, 4])
    nmsed_scores = gs.Variable(name="nmsed_scores").to_variable(dtype=np.float32, shape=[batch_size, keepTopK])
    nmsed_classes = gs.Variable(name="nmsed_classes").to_variable(dtype=np.float32, shape=[batch_size, keepTopK])
    new_outputs = [num_detections, nmsed_boxes, nmsed_scores, nmsed_classes]  # do not change
    nms_node = gs.Node(  # define nms plugin
        op="BatchedNMSDynamic_TRT",  # match with batchedNMSPlugin
        attrs=create_attrs(input_h, input_w, topk, keepTopK),  # set attributes for nms plugin
        inputs=[boxes_tensor, confs_tensor],
        outputs=new_outputs
    )
    graph.nodes.append(nms_node)  # nms plugin added
    graph.outputs = new_outputs
    graph = graph.cleanup().toposort()
    onnx.save(gs.export_onnx(graph), model_file)  # save model
    return model_file
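For reference when comparing against buffers allocated on the C++ side, the sketch below lists the four outputs declared above and how many bytes each occupies. It assumes batch size 1 and keepTopK=100 (the default in the function above); `nms_output_specs` and `buffer_bytes` are hypothetical helper names, not TensorRT API.

```python
import math

# Element sizes in bytes for the two dtypes used by the NMS outputs.
DTYPE_BYTES = {"int32": 4, "float32": 4}


def nms_output_specs(batch_size, keep_top_k):
    """Return (name, dtype, shape) for each output declared on the NMS node."""
    return [
        ("num_detections", "int32", (batch_size, 1)),
        ("nmsed_boxes", "float32", (batch_size, keep_top_k, 4)),
        ("nmsed_scores", "float32", (batch_size, keep_top_k)),
        ("nmsed_classes", "float32", (batch_size, keep_top_k)),
    ]


def buffer_bytes(dtype, shape):
    """Number of bytes needed to hold a dense tensor of this dtype and shape."""
    return DTYPE_BYTES[dtype] * math.prod(shape)


sizes = {
    name: buffer_bytes(dtype, shape)
    for name, dtype, shape in nms_output_specs(batch_size=1, keep_top_k=100)
}
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

Note that num_detections is the only int32 output; the other three, including nmsed_classes, are float32.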
I have tested my C++ code with other models and it worked fine as well.
So I see two possible explanations:
1. I didn’t initialize the plugin libraries correctly. In particular, I run
    initLibNvInferPlugins(&m_logger, "");
before deserializing the engine file.
2. There is a bug in this TensorRT plugin.
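One possible way to compare the Python and C++ runs is to dump the four output buffers from each to raw bytes and decode them with the same script. The sketch below is only an illustration under several assumptions: the buffers are raw little-endian dumps, the batch size is 1, and `decode_nms_outputs` is a hypothetical helper. It relies on the plugin's documented behaviour that only the first num_detections entries of each output are valid.

```python
import struct


def decode_nms_outputs(num_det_raw, boxes_raw, scores_raw, classes_raw, keep_top_k):
    """Decode raw little-endian output buffers for a single image."""
    # num_detections is int32; the remaining outputs are float32.
    (count,) = struct.unpack("<i", num_det_raw[:4])
    boxes = struct.unpack(f"<{keep_top_k * 4}f", boxes_raw)
    scores = struct.unpack(f"<{keep_top_k}f", scores_raw)
    classes = struct.unpack(f"<{keep_top_k}f", classes_raw)

    detections = []
    for i in range(count):  # entries past `count` are padding
        detections.append({
            "box": boxes[4 * i: 4 * i + 4],
            "score": scores[i],
            "class_id": int(classes[i]),
        })
    return detections


# Synthetic buffers standing in for real dumps (keepTopK = 2, one detection).
num_det_raw = struct.pack("<i", 1)
boxes_raw = struct.pack("<8f", 90.0, 45.0, 110.0, 55.0, 0.0, 0.0, 0.0, 0.0)
scores_raw = struct.pack("<2f", 0.75, 0.0)
classes_raw = struct.pack("<2f", 3.0, 0.0)

decoded = decode_nms_outputs(num_det_raw, boxes_raw, scores_raw, classes_raw, keep_top_k=2)
print(decoded)
```

If the C++ dump decodes to zero detections while the Python dump does not, the difference lies in how the engine is loaded or fed on the C++ side rather than in the engine file itself.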
Environment
TensorRT Version: 8.4.3.1
Installed with TensorRT-8.4.3.1.Linux.x86_64-gnu.cuda-11.6.cudnn8.4.tar.gz
GPU Type: RTX 3090
Nvidia Driver Version: 470.239.06
CUDA Version: 11.4.4
CUDNN Version: 8.2.2
Operating System + Version: Ubuntu 20.04
Python Version (if applicable): 3.10.9
PyTorch Version (if applicable): 2.0.0
Baremetal or Container (if container which image + tag): nvcr.io/nvidia/cuda:11.4.3-cudnn8-devel-ubuntu20.04
Relevant Files
My C++ code is similar to this: https://github.com/cyrusbehr/tensorrt-cpp-api