I am working on a project where I analyze fitness videos using MediaPipe's pose landmarker (configured via PoseLandmarkerOptions) to read body posture from each frame of a 10-minute video. I store the body posture data in a list, which gives an array of shape (3000, 33, 3): 3000 frames, 33 landmarks per frame representing body parts (e.g., left/right arm, leg, head), and x, y, z coordinates for each landmark.
By plotting the deltas (movements) between points (plot1), I can visually identify when each exercise begins and ends (plot2).
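For reference, here is a minimal sketch of one way such deltas could be computed. It assumes the landmarks are held in a NumPy array of shape (frames, 33, 3); the function name `motion_energy` and the random stand-in data are hypothetical and used only for illustration, not taken from my actual pipeline.

```python
import numpy as np

def motion_energy(landmarks: np.ndarray) -> np.ndarray:
    """Mean Euclidean displacement of all landmarks between consecutive frames."""
    deltas = np.diff(landmarks, axis=0)         # (frames - 1, 33, 3)
    per_point = np.linalg.norm(deltas, axis=2)  # (frames - 1, 33)
    return per_point.mean(axis=1)               # (frames - 1,)

# Synthetic stand-in for the real pose data (assumption: same shape as described).
rng = np.random.default_rng(0)
landmarks = rng.random((3000, 33, 3))

energy = motion_energy(landmarks)
print(energy.shape)
```

Plotting `energy` against the frame index gives a single movement curve per video, which is roughly what the plots above show.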
However, I need to programmatically segment this data into distinct exercises. The exercises do not start at frame 0, there can be rest periods between them, and their durations vary.
My initial idea was to compare the first 5 seconds to the next 5 seconds using Euclidean distance or Dynamic Time Warping (DTW), but this method does not segment the exercises accurately (plot3).
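To make the attempted approach concrete, here is a rough sketch of the Euclidean variant only (DTW is not shown). It assumes non-overlapping windows and a frame rate of 5 fps, inferred from 3000 frames over 10 minutes; `window_distances` and the random data are hypothetical placeholders.

```python
import numpy as np

def window_distances(landmarks: np.ndarray, window: int) -> np.ndarray:
    """Euclidean distance between each window of frames and the window right after it."""
    flat = landmarks.reshape(len(landmarks), -1)  # (frames, 99)
    n_windows = len(flat) // window               # drop any incomplete trailing window
    blocks = flat[: n_windows * window].reshape(n_windows, window, -1)
    diffs = blocks[1:] - blocks[:-1]              # (n_windows - 1, window, 99)
    # Frobenius norm over each (window, 99) block is the plain Euclidean distance.
    return np.linalg.norm(diffs, axis=(1, 2))

FPS = 5  # assumption: 3000 frames / 600 seconds
rng = np.random.default_rng(0)
landmarks = rng.random((3000, 33, 3))  # synthetic stand-in for the pose data

dist = window_distances(landmarks, window=5 * FPS)
print(dist.shape)
```

Each value in `dist` compares one 5-second block with the block that follows it, so a spike is expected wherever the movement pattern changes.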
Question:
What are some alternative methods or approaches to effectively segment the exercises in the video?
Any suggestions or examples of similar implementations would be greatly appreciated!