Tag Archive for pytorch, onnx, onnxruntime

Performance Discrepancy Between PyTorch and ONNX Model Conversion: Unexpected Increase in Latency

I trained a binary image classifier with the timm library, using ConvNeXt Base as the pretrained backbone. When I converted the model to ONNX format to reduce memory usage and response time, I instead observed a significant increase in latency. Specifically, for a batch of 100 images, inference with the PyTorch .pth model took 7.6024 seconds, whereas the ONNX model took 74.3477 seconds under the same conditions. This discrepancy concerns me, and I would like to understand whether this level of performance degradation is typical, as I did not anticipate it.
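In outline, the comparison above can be reproduced with something like the following; the file names, the 224×224 input size, and the CPU execution provider are placeholders for illustration, not my exact setup:

```python
import time

import numpy as np
import onnxruntime as ort
import timm
import torch

# Dummy batch of 100 images (placeholder size 3x224x224, float32).
batch = torch.randn(100, 3, 224, 224)

# --- PyTorch (.pth) timing ---
model = timm.create_model("convnext_base", pretrained=False, num_classes=2)
# model.load_state_dict(torch.load("classifier.pth", map_location="cpu"))  # placeholder checkpoint path
model.eval()
with torch.no_grad():
    start = time.perf_counter()
    _ = model(batch)
    torch_seconds = time.perf_counter() - start

# --- ONNX Runtime timing ---
# The providers list controls where the ONNX model runs (CPU here as a placeholder).
session = ort.InferenceSession("classifier.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
start = time.perf_counter()
_ = session.run(None, {input_name: batch.numpy()})
ort_seconds = time.perf_counter() - start

print(f"PyTorch: {torch_seconds:.4f} s, ONNX Runtime: {ort_seconds:.4f} s")
```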
This is the code I used to export the ONNX model:
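(The export follows the standard torch.onnx.export pattern; in the sketch below, the checkpoint path, opset version, and dynamic-axes settings are representative assumptions rather than my exact script.)

```python
import timm
import torch

# Rebuild the binary classifier and load the trained weights (placeholder path).
model = timm.create_model("convnext_base", pretrained=False, num_classes=2)
model.load_state_dict(torch.load("classifier.pth", map_location="cpu"))
model.eval()

# Dummy input used for tracing; batch dimension marked dynamic so larger batches work.
dummy = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy,
    "classifier.onnx",
    export_params=True,
    opset_version=17,
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)
```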