January 9, 2011

Kinect Theremin - Sneak Peek



Not surprisingly, I've contracted Kinect fever :) From the moment I heard about the Kinect, I knew it would be perfect for a Theremin simulator. Well, it turns out it's not perfect, but it is super fun.

How does the interface work?
When you step in front of the Kinect controller you see a contour map of yourself in gray. Once you hold your hands out in front of you, they are detected and mapped to pitch (right hand z-axis), volume (left hand y-axis), and modulation (left hand x-axis). The color hue represents the pitch, where C is red, D is orange, etc. The brightness of the colors represents the volume, so soft sounds are very faint whereas loud sounds are bright.
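
To make the hand-to-sound mapping concrete, here is a minimal C++ sketch. It is not the actual program's code. It assumes each hand position has already been normalized to the range 0 to 1 on every axis, and the names Hand, Controls, scaleTo, and mapHands are hypothetical. The output ranges are simply the standard MIDI ones: 14 bits for pitch bend, 7 bits for volume and modulation.

```cpp
// Hypothetical hand position; each axis is assumed to be normalized to [0, 1].
struct Hand { double x, y, z; };

// Control values in standard MIDI ranges.
struct Controls {
    int pitchBend;   // 14-bit value: 0..16383
    int volume;      // 7-bit value: 0..127
    int modulation;  // 7-bit value: 0..127
};

// Clamp a value to [0, 1] so out-of-range tracking data cannot overflow.
constexpr double clamp01(double v) {
    return v < 0.0 ? 0.0 : (v > 1.0 ? 1.0 : v);
}

// Scale a normalized value to 0..maxValue, rounding to the nearest integer.
constexpr int scaleTo(double v, int maxValue) {
    return static_cast<int>(clamp01(v) * maxValue + 0.5);
}

// Right hand z-axis -> pitch, left hand y-axis -> volume,
// left hand x-axis -> modulation, as described above.
constexpr Controls mapHands(const Hand& left, const Hand& right) {
    return Controls{ scaleTo(right.z, 16383),
                     scaleTo(left.y, 127),
                     scaleTo(left.x, 127) };
}
```

For example, mapHands(Hand{0.5, 1.0, 0.0}, Hand{0.0, 0.0, 1.0}) gives full pitch bend (16383), full volume (127), and a modulation value of 64. Which direction along each axis counts as "up" is a guess here; a real implementation would calibrate that against the depth data.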

What gear are you using?
Aside from a Kinect controller and a PC, no special hardware is needed, unlike the Wii Theremin.
I wrote the code in C++ for Windows using the C driver package written by Stijn Kuipers / "Zephod". The program uses only the 3D depth information to detect where the performer's hands are in space, then sends the corresponding MIDI messages for pitch bend, volume, and modulation. I use Maple Virtual MIDI Cable to route the messages to the Absynth 5.0 software synthesizer by Native Instruments. Rather than sending distinct Note On / Note Off messages, the program sends a single Note On at startup and from then on sends only Pitch Bend messages based on your hand position. Most software synthesizers don't support fluid multi-octave pitch bend, but Absynth handles this with grace.
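
For readers unfamiliar with the MIDI side, here is a rough sketch of what those messages look like as bytes. The byte layouts follow the MIDI 1.0 channel message format, but everything else is an assumption for illustration: the names MidiMessage, noteOn, controlChange, pitchBend, and bendForSemitones are hypothetical, and using controller 7 (channel volume) and controller 1 (modulation wheel) is just the conventional choice, not necessarily what the program does. The bend range is also a parameter you would have to match to the synthesizer's setting.

```cpp
#include <cstdint>

// A three-byte MIDI channel message.
struct MidiMessage { std::uint8_t status, data1, data2; };

// Note On: status 0x90 | channel, then note number and velocity (7 bits each).
constexpr MidiMessage noteOn(int channel, int note, int velocity) {
    return MidiMessage{ static_cast<std::uint8_t>(0x90 | (channel & 0x0F)),
                        static_cast<std::uint8_t>(note & 0x7F),
                        static_cast<std::uint8_t>(velocity & 0x7F) };
}

// Control Change: status 0xB0 | channel, then controller number and value.
// Assumption: controller 7 for volume and controller 1 for modulation.
constexpr MidiMessage controlChange(int channel, int controller, int value) {
    return MidiMessage{ static_cast<std::uint8_t>(0xB0 | (channel & 0x0F)),
                        static_cast<std::uint8_t>(controller & 0x7F),
                        static_cast<std::uint8_t>(value & 0x7F) };
}

// Pitch Bend: status 0xE0 | channel, then the 14-bit value split into
// a 7-bit LSB followed by a 7-bit MSB. 8192 means "no bend".
constexpr MidiMessage pitchBend(int channel, int value14) {
    return MidiMessage{ static_cast<std::uint8_t>(0xE0 | (channel & 0x0F)),
                        static_cast<std::uint8_t>(value14 & 0x7F),
                        static_cast<std::uint8_t>((value14 >> 7) & 0x7F) };
}

// Keep a bend value inside the legal 14-bit range.
constexpr int clampBend(int v) {
    return v < 0 ? 0 : (v > 16383 ? 16383 : v);
}

// Convert an offset in semitones from the held note into a 14-bit bend value,
// given the synthesizer's bend range (in semitones, each direction).
constexpr int bendForSemitones(double semitones, double bendRangeSemitones) {
    return clampBend(static_cast<int>(
        8192.0 + semitones / bendRangeSemitones * 8192.0 + 0.5));
}
```

Under these assumptions, a program in this style would send something like noteOn(0, 60, 100) once at startup, then on every frame send pitchBend(0, bendForSemitones(offset, 24.0)) along with controlChange(0, 7, volume) and controlChange(0, 1, modulation). With a hypothetical bend range of 24 semitones, an offset of one octave up works out to a bend value of 12288.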

So what's wrong with Kinect?
Don't get me wrong, the Kinect is a great device - it enables applications like this where the user doesn't need to hold anything or don special gloves to control the system. Compared with the Wii Theremin, however, the Kinect hardware has some limitations that make it more difficult to play musically. First off, the Kinect controller has significant latency (the time between the user's action and the delivery of the data). I haven't done any precise measurements, but it's probably 200 to 300 milliseconds, which is really tricky for an instrument that depends so heavily on player feedback to target a specific pitch. Even if you can position your hands precisely by muscle memory, you need to hit the note a fraction of a second before you want it to sound, which makes even moderately fast musical passages difficult to play.

Secondly, the Kinect delivers data frames just 30 times per second, so when playing a smooth glissando (slide) between two notes -- a common element in Theremin performance -- the steps between successive pitches are fairly audible. The Wii Theremin, on the other hand, samples data 100 times per second, so glissandos are very smooth.
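
A little arithmetic shows why the frame rate matters. The sketch below assumes the pitch is updated exactly once per data frame with no smoothing in between, and the function names are hypothetical. The one-octave, one-second glissando used in the example is an arbitrary illustration, not a measurement.

```cpp
// Size of each pitch step in cents (100 cents = 1 semitone) when a
// glissando spanning `semitones` over `seconds` is updated once per frame.
constexpr double centsPerFrame(double semitones, double seconds,
                               double framesPerSecond) {
    return (semitones * 100.0) / (seconds * framesPerSecond);
}

// Time between successive updates, in milliseconds.
constexpr double frameIntervalMs(double framesPerSecond) {
    return 1000.0 / framesPerSecond;
}
```

For a one-octave slide lasting one second, centsPerFrame(12.0, 1.0, 30.0) is 40 cents per step at the Kinect's 30 frames per second, while centsPerFrame(12.0, 1.0, 100.0) is 12 cents per step at the Wii Theremin's 100 samples per second. The updates also arrive roughly every 33 milliseconds instead of every 10, so each step is both larger and held longer.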

When can I download and play with it?
Currently I have no idea when or if I'll share the code. I'm investigating product and performance opportunities so I'm not sharing the source at this point, sorry :(

Where have you been since creating the Wii Theremin two years ago?
Working on my other big work in progress.