I am developing an application that takes dynamic gestures as input and maps them to keyboard controls. By dynamic gesture I mean, for example, a hand moving from left to right or from right to left.
I have four gestures: hand moving bottom to top, top to bottom, left to right and right to left.
I am able to recognize the gestures, but the problem is how to decide where a gesture starts and ends in a continuous video input.
What is the most efficient and effective algorithm for detecting this on a video input?
One way of dealing with this problem is to have a finite state machine with at least three states:
- not detecting anything
- detection phase
- gesture detected
Then you need to carefully design the conditions for each state transition (e.g. going from the detection phase back to “not detecting anything” in case of failure) and evaluate them at each frame of your video stream.
You may need one state machine for each gesture you want to detect, and more than one “detection phase” (“hand at the top”, then “hand middle” …).
When you reach the “gesture detected” state, you can fire the events associated with this gesture.
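As a rough illustration of the state machine described above, here is a minimal sketch for a single gesture (hand moving left to right). It assumes you already get one normalized horizontal hand position per frame (0.0 = left edge, 1.0 = right edge, `None` when no hand is found). The class name, the `min_travel`, `max_frames` and `jitter` parameters and the `on_detect` callback are all made up for this example and would need tuning for a real camera setup.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()       # not detecting anything
    TRACKING = auto()   # detection phase
    DETECTED = auto()   # gesture detected


class SwipeRightDetector:
    """Hypothetical detector for a left-to-right hand movement."""

    def __init__(self, min_travel=0.3, max_frames=30, jitter=0.02, on_detect=None):
        self.min_travel = min_travel  # distance the hand must cover (assumed value)
        self.max_frames = max_frames  # give up if the gesture takes longer than this
        self.jitter = jitter          # backward motion tolerated as tracking noise
        self.on_detect = on_detect    # callback fired once per detected gesture
        self._reset()

    def _reset(self):
        self.state = State.IDLE
        self.start_x = None
        self.last_x = None
        self.frames = 0

    def update(self, x):
        """Feed the hand position for one frame and return the new state."""
        if self.state is State.DETECTED:
            self._reset()  # re-arm after a detection
        if x is None:
            self._reset()  # hand lost: back to "not detecting anything"
            return self.state
        if self.state is State.IDLE:
            # First frame with a hand: enter the detection phase.
            self.start_x = self.last_x = x
            self.state = State.TRACKING
            return self.state
        # TRACKING: check the failure conditions first.
        self.frames += 1
        if x < self.last_x - self.jitter or self.frames > self.max_frames:
            self._reset()  # moved backwards or took too long
            return self.state
        self.last_x = x
        if x - self.start_x >= self.min_travel:
            self.state = State.DETECTED
            if self.on_detect is not None:
                self.on_detect()
        return self.state
```

You would create one such detector per gesture and call `update` on each of them for every frame. In this sketch the start of the gesture is the frame where the detector enters `TRACKING` and the end is the frame where it reaches `DETECTED`. Note that it re-arms immediately after a detection, so a hand that keeps moving may trigger the gesture again; a cool-down state is one way to avoid that.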
The robustness of your gesture detection depends a lot on how you define your states (do you measure the position or the movement?) and on how you tune the state-change parameters.
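To make the "position or movement" choice more concrete, here is one possible movement-based variant. This is only a sketch under assumptions: the per-frame `(x, y)` hand positions are normalized image coordinates with `y` growing downward, and the two speed thresholds (`start_speed`, `stop_speed`) are invented values. A gesture starts when the frame-to-frame speed rises above the higher threshold and ends when it drops below the lower one; using two thresholds keeps the state from flickering when the speed hovers around a single value.

```python
def classify(p0, p1):
    """Label a movement by its dominant displacement between two points."""
    dx = p1[0] - p0[0]
    dy = p1[1] - p0[1]
    if abs(dx) >= abs(dy):
        return "left_to_right" if dx > 0 else "right_to_left"
    # Image coordinates: y grows downward, so positive dy means moving down.
    return "top_to_bottom" if dy > 0 else "bottom_to_top"


def segment_gestures(positions, start_speed=0.05, stop_speed=0.02):
    """Return (start_index, end_index, label) for each movement found.

    positions is a list of (x, y) hand positions, one per frame.
    A movement still in progress at the end of the input is not reported.
    """
    segments = []
    start = None  # index of the frame where the current movement began
    for i in range(1, len(positions)):
        dx = positions[i][0] - positions[i - 1][0]
        dy = positions[i][1] - positions[i - 1][1]
        speed = (dx * dx + dy * dy) ** 0.5  # displacement per frame
        if start is None:
            if speed >= start_speed:
                start = i - 1  # the hand started moving after this frame
        elif speed <= stop_speed:
            end = i - 1  # the hand was last moving when it reached this frame
            segments.append((start, end, classify(positions[start], positions[end])))
            start = None
    return segments
```

This processes a recorded list for simplicity; for live video the same logic can run incrementally by keeping `start` and the previous position between frames.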