Please excuse my late posting. I was at Penn for less than 48 hours last week (Fall Break plus a trek to EA's Redwood Shores HQ for my Wharton EA field application project). Expect a few posts in the next few days to make up for it.
As previously mentioned, the Xbox Kinect SDK provides joint positions (in the form of Vector3s).
Until now, I have simply been mapping the relative velocity/position of joints to onscreen actions such as the zoom and pan of 2D objects. However, there are a few problems with this naive approach. First, the raw position data from the Kinect sensor is not perfectly stable. For example, if you hold your hand out straight without any movement, the onscreen object mapped to your hand's position will still jitter due to sensor inaccuracies.
Additionally, we are not able to separate gestures: if an individual accelerates his hand to zoom and then stops, the onscreen zoom should stop as well; however, this does not happen if we simply map zoom to hand position, since the mapping continues even after the gesture ends.
Thus, over the next few days, I will add filtering and motion segmentation features on top of my existing Kinect framework.
For filtering, my main objective is to reduce jitter and smooth the raw position data so that motion mappings produce smooth onscreen actions. There are two aspects to filtering.
(1) The first is to reduce drift, which is low-frequency noise. Drift is often caused when the subject ever so slightly shifts his overall position. The simplest way to reduce drift is to make the subject's root position (which can be the shoulder position in the case of hand/arm motion gestures) the origin of the coordinate frame.
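The coordinate shift above can be sketched in a few lines. This is only an illustration in Python, not the Kinect SDK's API; the function name and tuple representation are my own:

```python
def to_root_relative(joint_pos, root_pos):
    """Express a joint position relative to a root joint (e.g., the shoulder).

    Whole-body drift moves both the joint and the root together, so
    subtracting the root cancels most of the low-frequency drift.
    """
    return tuple(j - r for j, r in zip(joint_pos, root_pos))

# Example: the hand drifts with the torso, but the hand-minus-shoulder
# vector stays stable (coordinates here are made-up meters).
hand = (0.52, 1.10, 2.01)
shoulder = (0.30, 1.40, 2.00)
relative_hand = to_root_relative(hand, shoulder)
```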
(2) The second component of jitter comes from high-frequency noise. This is a result of both minute motions of the subject and slight inaccuracies in the Kinect camera sensors. The best way to fix this problem is to pass the data through a smoothing filter. Luckily, the Kinect SDK comes with a built-in smoothing function based on the Holt double exponential smoothing method.
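For reference, here is a minimal Python sketch of Holt double exponential smoothing on a single coordinate. The alpha/beta constants are placeholders, not the SDK's tuned defaults:

```python
def holt_smooth(data, alpha=0.5, beta=0.5):
    """Holt double exponential smoothing of a 1-D sequence.

    alpha: level smoothing factor in (0, 1]
    beta:  trend smoothing factor in (0, 1]
    Tracks both a smoothed level and a trend, so it lags less on
    steady motion than simple exponential smoothing.
    """
    level, trend = data[0], 0.0
    smoothed = [level]
    for x in data[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        smoothed.append(level)
    return smoothed
```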
I will also apply the Holt smoothing method to the velocity and acceleration calculations in the framework.
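As a sketch of what that calculation might look like (assuming positions sampled at the Kinect's ~30 fps; function names and the simple exponential-smoothing variant are illustrative, not my final framework code):

```python
def finite_differences(samples, dt):
    """Per-frame derivative estimates from successive samples
    (positions -> velocities, or velocities -> accelerations)."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]

def ema(values, alpha=0.3):
    """Simple exponential moving average to tame noisy derivatives.
    alpha is a placeholder constant, to be tuned later."""
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

# Usage: derivatives amplify sensor noise, so smooth after differencing.
dt = 1.0 / 30.0  # Kinect frame interval
positions = [0.00, 0.03, 0.07, 0.09, 0.14]
velocities = ema(finite_differences(positions, dt))
```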
Finding the right constants for optimal smoothing, and striking a balance between accuracy/responsiveness and minimal jitter, will be an ongoing process throughout my project.
My motion segmentation implementation will mainly revolve around the zero-crossings of specific joint velocities. The basic idea is that every abrupt change in velocity represents a new motion. More to come on this as I begin the implementation...
This is a problem I'm also running into. Nice solutions. And your project is great so far, David.
Thanks Marley.
Here is the equation that I am using for smoothing velocity and acceleration:
http://en.wikipedia.org/wiki/Exponential_smoothing#The_exponential_moving_average
Smoothing will be very important if you are using abrupt velocity changes for motion segmentation.
- You may also want to check for velocity changes over a window of time, as opposed to instantaneous velocity changes.
Also, I assume your segmentation approach will break up input data into primitive motion "actions". Your system should then be able to detect sequences of these primitive actions and then map them to high-level commands.
Kind of like an action language (action-ary)