I am using the YoloDotNet NuGet package to test the performance of YOLO models for my degree thesis. However, I have encountered an issue where the GPU performance is barely better than the CPU performance.
Environment:
- YoloDotNet version: v2.0
- CPU: AMD Ryzen 7 7800X3D
- GPU: NVIDIA RTX 4070 Super
- CUDA/cuDNN version: CUDA 11.8 and cuDNN 8.9.7
- .NET version: 8
Steps to reproduce:
var sw = new Stopwatch();
var times = new List<double>();
for (var i = 0; i < 500; i++)
{
    var file = $@"C:\Users\Utente\Documents\assets\images\input\frame_{i}.jpg";
    using var image = SKImage.FromEncodedData(file);

    // Only the detection call itself is timed
    sw.Restart();
    var results = yolo.RunObjectDetection(image, confidence: 0.25, iou: 0.7);
    sw.Stop();

    image.Draw(results);
    image.Save(file.Replace("input", $"output_{yolo_version}{version}_{target}")
                   .Replace(".jpg", $"_detect_{yolo_version}{version}_{target}.jpg"),
               SKEncodedImageFormat.Jpeg);

    times.Add(sw.Elapsed.TotalMilliseconds);
    Console.WriteLine($"Time taken for image {i}: {sw.Elapsed.TotalMilliseconds:F2} ms");
}
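One thing I considered when measuring this way: the very first CUDA inference typically pays a one-time cost (context creation, cuDNN kernel selection) that can skew the average. A minimal warm-up sketch before the timed loop, assuming the same `yolo` instance as above and a hypothetical `warmupImage` loaded beforehand:

```csharp
// Hedged sketch: run a few untimed inferences so that one-time CUDA/cuDNN
// initialization does not inflate the measured per-image average.
// `yolo` is the instance from the setup below; `warmupImage` is a
// placeholder name for any pre-loaded SKImage.
for (var w = 0; w < 3; w++)
{
    _ = yolo.RunObjectDetection(warmupImage, confidence: 0.25, iou: 0.7);
}
// ...then run the timed loop exactly as above.
```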
This is how I measure the time for each detection.

To load the model, I use this setup in the GPU case:
yolo = new Yolo(new YoloOptions
{
    OnnxModel = @$"C:\Users\Utente\Documents\assets\model\yolov{yolo_version}{version}_{target}.onnx",
    ModelType = ModelType.ObjectDetection, // Model type
    Cuda = true,                           // Use CUDA for GPU-accelerated inference. Default = true
    GpuId = 0,                             // Select GPU by id. Default = 0
    PrimeGpu = true,                       // Pre-allocate GPU before first inference. Default = false
});
Console.WriteLine(yolo.OnnxModel.ModelType);
Console.WriteLine($"Using GPU for version {yolo_version}{version}");
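To rule out a silent fallback to CPU inside the library, I also thought about checking the CUDA execution provider directly with ONNX Runtime. A sketch of that check, assuming the Microsoft.ML.OnnxRuntime.Gpu package is installed (the model path below is a placeholder):

```csharp
using Microsoft.ML.OnnxRuntime;

// Hedged sketch: if CUDA or cuDNN cannot be loaded, creating the CUDA
// session options (or the session itself) throws an exception here,
// which would point to an environment problem rather than YoloDotNet.
using var cudaOptions = SessionOptions.MakeSessionOptionWithCudaProvider(deviceId: 0);
using var session = new InferenceSession(@"C:\path\to\model.onnx", cudaOptions); // placeholder path
Console.WriteLine("CUDA execution provider initialized successfully.");
```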
Performance Metrics using yolov8:
GPU Inference Time:
Total time taken for version m: 25693 ms
Average time per image for version m: 51.25 ms
CPU Inference Time:
Total time taken for version m: 34459.73 ms
Average time per image for version m: 69.74 ms
I would like to post graphs of the times, but I do not have enough reputation.
The issue presents itself across different model sizes; I have shown only size m for ease of visualization.
Expected behavior: inference using the GPU should be substantially faster than inference using the CPU. Instead, the measured times show only a small improvement on the GPU.