I have a YOLOv8n detection model that detects two classes, cat and dog, with bounding boxes and confidences. When I visualize the model with Netron, the final output tensor has shape 1x6x10710, as shown below. The raw prediction array from the model is saved as a .bin file, which I read with NumPy:
array = np.fromfile(yolo_data_path, dtype=np.float32)  # np.fromfile defaults to float64
The size of the .bin file is exactly 257,040 bytes. Since the output is float32 with shape [1,6,10710], the size seems to be correct: 4 bytes x 6 x 10710 = 257,040.
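The size check above can be reproduced in a few lines. This is only a sketch: the file name `yolo_output.bin` and the zero-filled stand-in data are hypothetical, used so the snippet runs without the real dump. It also shows why the dtype matters when reading: `np.fromfile` assumes float64 unless told otherwise, which would give half as many values.

```python
import os
import tempfile

import numpy as np

shape = (1, 6, 10710)                                      # shape reported by Netron
n_values = int(np.prod(shape))                             # 6 * 10710 values
expected_bytes = n_values * np.dtype(np.float32).itemsize  # 4 bytes per float32

# Hypothetical stand-in file with the same layout as the real dump.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "yolo_output.bin")
    np.zeros(shape, dtype=np.float32).tofile(path)
    file_bytes = os.path.getsize(path)
    as_f64 = np.fromfile(path)                    # default dtype: float64
    as_f32 = np.fromfile(path, dtype=np.float32)  # matches the model output

pred = as_f32.reshape(shape)  # back to [1, 6, 10710]
print(expected_bytes, file_bytes)
print(as_f64.size, as_f32.size)
```

If the real file is read without `dtype=np.float32`, the reshape to `(1, 6, 10710)` fails because the element count no longer matches.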
enter image description here
Now, given this .bin output array, how can I read it, decode the bounding boxes, and print the final detections, i.e. the bounding boxes, classes, and their confidences? I assume that the 6 is 4 bounding box coordinates, a confidence, and a class number.
Looking at the YOLOv8 predict.py, I understand that after parsing these predictions, we have to pass them through non_max_suppression() to get the final results.
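For reference, here is a minimal NumPy sketch of one way the decoding could look. It rests on assumptions that are not confirmed by the question: that each of the 10710 columns holds 4 box values as (center x, center y, width, height) followed by one score per class, so 6 = 4 + 2 rather than 4 + confidence + class id; that the class order is cat = 0, dog = 1; and that the thresholds 0.25 and 0.45 are reasonable defaults. The `nms` helper is a plain greedy stand-in, not the Ultralytics `non_max_suppression()` implementation, and `decode`, `xywh_to_xyxy`, and `nms` are hypothetical names.

```python
import numpy as np

def xywh_to_xyxy(boxes):
    """Convert (cx, cy, w, h) rows to (x1, y1, x2, y2) rows."""
    cx, cy, w, h = boxes.T
    return np.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=1)

def nms(boxes, scores, iou_thres=0.45):
    """Greedy NMS on xyxy boxes; returns indices of the boxes that are kept."""
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = scores.argsort()[::-1]          # highest score first
    keep = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        keep.append(i)
        # Intersection of the best box with every remaining box.
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter + 1e-9)
        order = rest[iou <= iou_thres]      # drop boxes overlapping too much
    return np.array(keep, dtype=int)

def decode(raw, class_names=("cat", "dog"), conf_thres=0.25, iou_thres=0.45):
    """Turn a raw [1, 4 + nc, N] output into (class, confidence, xyxy) tuples."""
    # Assumed layout: rows 0-3 are cx, cy, w, h; remaining rows are class scores.
    preds = raw.reshape(4 + len(class_names), -1).T   # (N, 4 + nc)
    boxes = xywh_to_xyxy(preds[:, :4])
    class_scores = preds[:, 4:]
    class_ids = class_scores.argmax(axis=1)
    confs = class_scores.max(axis=1)

    mask = confs > conf_thres                         # confidence filter
    boxes, confs, class_ids = boxes[mask], confs[mask], class_ids[mask]

    detections = []
    for c in np.unique(class_ids):                    # NMS per class
        idx = np.where(class_ids == c)[0]
        for k in nms(boxes[idx], confs[idx], iou_thres):
            j = idx[k]
            detections.append((class_names[c], float(confs[j]), boxes[j].tolist()))
    return detections

# Usage with the dump from the question (path variable as in the question):
# raw = np.fromfile(yolo_data_path, dtype=np.float32).reshape(1, 6, 10710)
# for name, conf, box in decode(raw):
#     print(name, round(conf, 3), [round(v, 1) for v in box])
```

Under these assumptions the boxes come out in the pixel coordinates of the network input, so they would still need to be scaled back (and any letterbox padding undone) to match the original image.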