I’m building a system that consists of:
- a detector based on YOLOv8 and trained on my custom dataset
- a classifier, for example EfficientNet
I have already trained these two and am now testing the system on a video to see the results.
The detector was trained with:
model = YOLO(model_path)
results_m = model.train(data=data, single_cls=False, imgsz=640, epochs=25)
The problem I am encountering is that detection seems to fail on static objects. My dataset has 3 classes: plastic bags, cardboard boxes, and an “other” class. If a plastic bag is being carried by a person, the system detects it, but once the bag is dropped and comes to rest on the ground, the system no longer detects it. Why does this happen?
My code is the following:
from tensorflow.keras.models import load_model
import cv2
from ultralytics import YOLO
classifier_class_names = {
0: 'Cardboard_box',
1: 'Other',
2: 'Plastic_bag',
}
detector = YOLO("/myPath/best.pt")
classifier = load_model("/myPath/efficientnet_model_unfreeze_128.h5")
video_path = "/myPath/video_test.mp4"
cap = cv2.VideoCapture(video_path)
output_path = "/myPath/video_out.mp4"
fourcc = cv2.VideoWriter_fourcc(*'mp4v')
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter(output_path, fourcc, fps, (width, height))
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    results = detector(frame)
    detections = results[0].boxes  # detected boxes for this frame
    for box in detections:
        x1, y1, x2, y2 = (int(v) for v in box.xyxy[0].tolist())
        conf = box.conf[0].item()
        cls = results[0].names[int(box.cls[0].item())]  # detector's class name (not used below)
        # Extract ROI
        roi = frame[y1:y2, x1:x2]
        if roi.size == 0:
            continue  # skip degenerate boxes; cv2.resize fails on an empty ROI
        # Preprocess ROI
        roi_resized = cv2.resize(roi, (300, 300))
        roi_resized = roi_resized / 255.0
        roi_resized = roi_resized.reshape(1, 300, 300, 3)
        # Classify ROI
        pred = classifier.predict(roi_resized)
        class_id = int(pred.argmax(axis=1)[0])
        class_name = classifier_class_names.get(class_id, 'Other')
        label = f'Class: {class_name}, Conf: {conf:.2f}'
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (0, 255, 0), 2)
    # Write the annotated frame once per frame, after all boxes are drawn
    out.write(frame)
cap.release()
out.release()
cv2.destroyAllWindows()
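To narrow down whether the static bag is missed entirely or only scored below the confidence threshold, a diagnostic along these lines might help. This is a minimal sketch and not part of my pipeline: `summarize_confidences` and `DEFAULT_CONF` are hypothetical names, and 0.25 is my assumption about the default prediction threshold, so it should be checked against the Ultralytics version in use.

```python
DEFAULT_CONF = 0.25  # assumed default prediction threshold; verify for your version


def summarize_confidences(per_frame, threshold=DEFAULT_CONF):
    """Summarize detector confidence per class across frames.

    per_frame: one list per frame, each containing (class_name, conf) tuples.
    Returns {class_name: {"frames_seen", "frames_below", "min_conf"}}, where
    each frame contributes the best confidence it reached for that class.
    """
    summary = {}
    for detections in per_frame:
        # Keep only the highest-confidence detection of each class in this frame.
        best = {}
        for name, conf in detections:
            best[name] = max(conf, best.get(name, 0.0))
        for name, conf in best.items():
            stats = summary.setdefault(
                name, {"frames_seen": 0, "frames_below": 0, "min_conf": conf}
            )
            stats["frames_seen"] += 1
            stats["frames_below"] += int(conf < threshold)
            stats["min_conf"] = min(stats["min_conf"], conf)
    return summary


# Illustrative input only (made-up numbers, not results from my video).
example = [
    [("Plastic_bag", 0.80)],
    [("Plastic_bag", 0.10), ("Plastic_bag", 0.20)],
    [],
]
report = summarize_confidences(example)
```

To feed it real data, the detector could be run with a very low threshold, for example `detector(frame, conf=0.05)`, collecting `(results[0].names[int(b.cls[0])], float(b.conf[0]))` for each box in each frame. If the bag shows up in `frames_below` once it is on the ground, the detector still sees it but with low confidence; if it disappears from the report altogether, the model is not producing a box for it at all.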