Sunday, November 20, 2011

Challenges with Motion Recognition / Motion-Interaction Mapping

After creating the motion recognition framework for a 3D object-editor-like interaction in Unity (zoom, pan, and rotate), I have come upon a few interesting choices in approach.

For some brief background, I currently break down all user motions into discrete actions. Each action contains information such as velocity, acceleration, start/end position, and duration. So far I have programmed two approaches to interpreting this data.
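To make the idea of an action concrete, here is a minimal sketch of such a record. It is written in Python for brevity rather than as Unity code, and the field names (`joint`, `start_pos`, `end_time`, and so on) are placeholders I made up for illustration, not the names used in the actual framework. Average velocity is derived from the endpoints here; a real tracker would more likely record it per frame.

```python
from dataclasses import dataclass
from typing import Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Action:
    """One discrete user motion (hypothetical layout, for illustration only)."""
    joint: str                 # which tracked joint moved, e.g. "left_hand"
    start_pos: Vec3            # position when the motion began
    end_pos: Vec3              # position when the motion ended (or latest sample)
    start_time: float          # seconds
    end_time: float            # seconds
    acceleration: Vec3 = (0.0, 0.0, 0.0)  # assumed to be supplied by the tracker

    @property
    def duration(self) -> float:
        return self.end_time - self.start_time

    @property
    def mean_velocity(self) -> Vec3:
        """Average velocity over the whole action (displacement / duration)."""
        d = self.duration
        if d <= 0:
            return (0.0, 0.0, 0.0)
        return tuple((e - s) / d for s, e in zip(self.start_pos, self.end_pos))
```

Both approaches described below read from records like this one; they differ only in when they read them.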

The first is a live approach, which does not wait for an action to end. Instead, it checks the action that is currently being tracked (if any). If that action satisfies certain rules for an interaction (e.g., a specific start position, a duration, or a requirement that it occur in unison with another action), then I start changing the state of the interface on each new frame update. Here is a quick example. For a valid rotation gesture, both hands must be above the waist, and the actions of the two hands must start at roughly the same time. If both initial conditions are true, I start changing the orientation of my object based on the updated positions of the two hands until the action is complete.
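The rotation example can be sketched roughly as follows. This is Python pseudocode-made-runnable, not the Unity implementation: the `LiveAction` type, the waist height, the 0.25 second tolerance for "roughly the same time", and the choice to map the tilt of the line between the hands to a rotation angle are all assumptions of mine for the sake of the example.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class LiveAction:
    """An action still in progress (hypothetical type)."""
    start_pos: Vec3
    start_time: float   # seconds
    current_pos: Vec3   # updated every frame by the tracker


WAIST_Y = 1.0          # assumed waist height in tracker units
MAX_START_GAP = 0.25   # assumed tolerance for "started together", in seconds


def rotation_started(left: Optional[LiveAction], right: Optional[LiveAction]) -> bool:
    """Check only the initial conditions: both hands above the waist,
    both actions begun at roughly the same time."""
    if left is None or right is None:
        return False
    hands_up = left.start_pos[1] > WAIST_Y and right.start_pos[1] > WAIST_Y
    together = abs(left.start_time - right.start_time) <= MAX_START_GAP
    return hands_up and together


def hand_axis_angle(left: LiveAction, right: LiveAction) -> float:
    """Angle (degrees) of the line from left hand to right hand in the x-y plane.
    Called once per frame while the gesture is live."""
    dx = right.current_pos[0] - left.current_pos[0]
    dy = right.current_pos[1] - left.current_pos[1]
    return math.degrees(math.atan2(dy, dx))
```

In a frame loop, `rotation_started` would be checked once when both actions are first seen, and `hand_axis_angle` would then drive the object's orientation on every update until either action ends. Nothing about the end of the motion is ever verified.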

In the second approach, rather than looking at the initial conditions of the current action, we look through the list of completed actions. We then analyze the saved data of those actions to see whether the sequence satisfies any possible interaction.
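A rough sketch of this second approach, again in Python and under my own assumptions: the `CompletedAction` type is a placeholder, and the zoom rule used here (two hands moving apart means zoom in, moving together means zoom out, with made-up thresholds) is only an illustration of matching against finished actions, not the actual gesture definition in the framework.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class CompletedAction:
    """A finished action with its saved data (hypothetical type)."""
    joint: str
    start_pos: Vec3
    end_pos: Vec3
    start_time: float
    end_time: float


def match_zoom(history: List[CompletedAction],
               max_gap: float = 0.25,      # assumed start-time tolerance, seconds
               min_travel: float = 0.2) -> Optional[str]:
    """Inspect the two most recent completed actions and decide whether,
    taken as a whole, they form a zoom gesture."""
    if len(history) < 2:
        return None
    a, b = history[-2], history[-1]
    if {a.joint, b.joint} != {"left_hand", "right_hand"}:
        return None
    if abs(a.start_time - b.start_time) > max_gap:
        return None
    # Compare hand separation at the start and at the end of the motion.
    start_sep = abs(a.start_pos[0] - b.start_pos[0])
    end_sep = abs(a.end_pos[0] - b.end_pos[0])
    change = end_sep - start_sep
    if change >= min_travel:
        return "zoom_in"
    if change <= -min_travel:
        return "zoom_out"
    return None
```

Because the matcher sees both the start and the end of each action, it can reject motions that began like a zoom but did not finish as one. The cost is that it cannot return anything until both actions are in the history.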

There are pluses and minuses to both approaches. In the live approach, the interaction is very responsive, and there is barely any downtime between the gesture and the change in the interface; however, we sacrifice accuracy. This approach depends on our read of the initial state being correct: we must assume that if the user begins a specific motion, they will also end it correctly.

The second approach is more accurate. We can look at the entire sequence of actions to make sure we apply the correct on-screen changes. However, there is lag. If the user attempts one long motion, we cannot process it until it is complete, so the user sees no change in the interface for a relatively long time.

Any thoughts?

