I have a MediaPipe hand-tracking task running successfully on OpenCV frames.
Full code: https://pastebin.com/2wubwxgg
Here’s the relevant code:
cap = cv.VideoCapture(0)
if not cap.isOpened():
    print("Cannot open camera")
    exit()
while True:
    # Capture frame-by-frame
    ret, frame = cap.read()
    # if frame is read correctly ret is True
    if not ret:
        print("Can't receive frame (stream end?). Exiting ...")
        break
    # Our operations on the frame come here
    # Display the resulting frame
    frame_timestamp_ms = int(cap.get(cv.CAP_PROP_POS_MSEC))
    mp_image = mp.Image(
        image_format=mp.ImageFormat.SRGB, data=np.ascontiguousarray(frame)
    )
    detection_result = landmarker.detect_async(mp_image, frame_timestamp_ms)
    if detection_result is not None:
        annotated_image = draw_landmarks_on_image(
            mp_image.numpy_view(), detection_result
        )
        annotated_image_bgr = cv.cvtColor(annotated_image, cv.COLOR_RGB2BGR)
        cv.imshow("Hand Tracking", annotated_image_bgr)
    if cv.waitKey(1) == ord("q"):
        break
cap.release()
cv.destroyAllWindows()
However, OpenCV is not displaying the frame.
I tested just the display code, excluding the frame annotation, and it works with non-annotated frames. I also tried skipping the conversion of my video frames to RGB, but to no avail. In every configuration I try, the frames are not displayed, even though the task correctly processes the image frames being passed in.
Everything else works, and I can see the task outputting tracking data within the terminal. My camera also indicates it is on.
Example output of landmark readings:
No hand visible:
hand landmarker result: HandLandmarkerResult(handedness=[], hand_landmarks=[], hand_world_landmarks=[])
With hand visible:
hand landmarker result: HandLandmarkerResult(handedness=[[Category(index=1, score=0.9681164026260376, display_name='Left', category_name='Left')]], hand_landmarks=[[NormalizedLandmark(x=0.8470160365104675, y=0.9493416547775269, z=6.204702458489919e-07, visibility=0.0, presence=0.0), NormalizedLandmark(x=0.7390292286872864, y=0.934730052947998, z=-0.03473040834069252, visibility=0.0, presence=0.0), NormalizedLandmark(x=0.6428059935569763, y=0.8524196147918701, z=-0.048505205661058426,
I cannot find an existing example of this, nor does the OpenCV documentation help.
Maybe I’m missing something obvious.
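One possible explanation, which I have not confirmed against my full code: as far as I understand the MediaPipe Tasks API, in live-stream mode detect_async returns None and delivers each result to the result_callback instead. That would match what I see, since the "hand landmarker result" lines are printed while the `if detection_result is not None:` branch containing cv.imshow never runs. Below is a minimal, standard-library-only sketch of the pattern I think is needed: the callback stores the latest result and the main loop reads it. `FakeLandmarker`, `LatestResult`, `on_result` and `run` are hypothetical names made up for illustration; `FakeLandmarker` only imitates the assumed behaviour and is not the real MediaPipe class.

```python
import threading


class LatestResult:
    """Thread-safe holder for the most recent result seen by the callback."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value


class FakeLandmarker:
    """Hypothetical stand-in for a live-stream landmarker (NOT the MediaPipe API).

    Assumption being modelled: detect_async returns None and passes the
    result to a callback on another thread.
    """

    def __init__(self, result_callback):
        self._callback = result_callback
        self._workers = []

    def detect_async(self, image, timestamp_ms):
        worker = threading.Thread(
            target=self._callback,
            args=(f"landmarks for {image}", image, timestamp_ms),
        )
        worker.start()
        self._workers.append(worker)
        # No return statement: the caller always receives None.

    def close(self):
        # Wait for all pending callbacks to finish.
        for worker in self._workers:
            worker.join()


def run(frames):
    """Feed frames to the fake landmarker; return what detect_async returned
    for each frame, plus the last result stored by the callback."""
    latest = LatestResult()

    def on_result(result, output_image, timestamp_ms):
        # Runs on the worker thread: only store the result here.
        latest.set(result)

    landmarker = FakeLandmarker(result_callback=on_result)
    returned = []
    for timestamp_ms, frame in enumerate(frames):
        returned.append(landmarker.detect_async(frame, timestamp_ms))
        result = latest.get()  # may be None or lag a frame or two behind
        # In the real loop: draw `result` on the frame if it is not None,
        # then call cv.imshow on every iteration either way.
    landmarker.close()
    return returned, latest.get()
```

If this is the cause, the fix in my loop would be to call cv.imshow on every iteration from the main thread and draw whatever result the callback stored most recently, rather than gating the display on the return value of detect_async.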