Over the last year, I’ve been developing an interactive audio spatialiser designed to move a vocal sound during a musical performance. My first approach let the user position the sound within an acoustic space by orienting the arm towards that position (read more about it). That was only a starting point towards the design and development of an interactive audio spatialiser that lets the user move and interact with a sound, here called a sound object, through common human physical behaviour. Contrary to P. Schaeffer’s definition of the sound object, for me the analogy to the sound-producing object is particularly important: the nature of the sound, and the way people most commonly interact with the object that produces it, inform the sound interaction design.

My current objective is to allow users to grab a sound object, move it around, and drop it or throw it away within a virtual acoustic space. The grab-and-throw metaphor within a 3D virtual environment has been explored before (Mine, M.R., 1995; Robinett W. and Holloway R., 1992). I was particularly inspired by the select-grab-manipulate-release process, which is described in Mine, M.R. et al. (1997) and Steinicke and Hinrichs (2006) and is also applied in the DigiTrans project, to realise a system to grab, move, and drop or throw sound objects. To achieve my objective, I proceeded in the following way:

First, I established the target hand poses for the grabbing, throwing and dropping gestures. The hand pose for grabbing is a fist, and the hand pose for throwing and dropping is a hand with the fingers spread.

Grabbing hand pose.
Dropping/throwing hand pose.

These hand poses were chosen for two main reasons. The most significant one is that they are the poses our hand is most likely to assume when we grab, throw or drop an object with one hand only. This makes the system easily discoverable and efficiently learnable (Vatavu R. and Zaiti I., 2013; Omata M. et al., 2000). The second reason is related to the technology I use in my research project to map gestures, which is the Myo armband. The Myo armband is perfectly able to track these two hand poses but, as you can imagine, knowing the hand pose alone is not enough to recognise every nuance of a grab, drop or throw gesture when trying to replicate the auditory feedback of a sound object that is grabbed and then dropped or thrown, taking into account its main features such as trajectory, speed, direction and the influence of gravitational force. Thus, I created a model for a Support Vector Machine (a machine learning system) fed with the mean absolute value of the EMG signals, together with the orientation of the arm and the deviation of its acceleration within the 3D space.
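To make the feature description above more concrete, here is a minimal sketch of how one feature vector for the classifier could be assembled. It is not the code used in the project: the function names, the `(roll, pitch, yaw)` representation of orientation, the window layout and the reading of "deviation" as a population standard deviation are all my assumptions for illustration, and it only uses the Python standard library.

```python
from statistics import mean, pstdev


def mean_absolute_value(window):
    """Mean absolute value (MAV) of one EMG channel over a window of samples."""
    return mean(abs(sample) for sample in window)


def feature_vector(emg_channels, orientation, accel_axes):
    """Build one feature vector that a classifier such as an SVM could consume.

    All argument shapes are hypothetical:
      emg_channels: list of per-channel sample windows
      orientation:  (roll, pitch, yaw) of the arm
      accel_axes:   list of per-axis acceleration windows (x, y, z)
    """
    # One MAV value per EMG channel.
    mav = [mean_absolute_value(channel) for channel in emg_channels]
    # "Deviation" is assumed here to mean the standard deviation per axis.
    accel_dev = [pstdev(axis) for axis in accel_axes]
    return mav + list(orientation) + accel_dev


# Toy data: two EMG channels and three acceleration axes.
emg = [[1, -1, 2, -2], [0, 0, 4, -4]]
accel = [[0.0, 0.0], [1.0, 3.0], [2.0, 2.0]]
features = feature_vector(emg, (0.1, 0.2, 0.3), accel)
print(features)
```

Vectors like this one, each labelled with the gesture performed, would then form the training set for the Support Vector Machine.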

Once I was able to track all gestures properly, I developed the audio part of the system. It consists of a stereo spatialiser which draws trajectories within the auditory scene. Trajectories are established by mapping the machine learning output, the orientation and the acceleration onto the envelope properties (attack, decay, sustain and release) of the sound object for each of the output audio channels.

Envelope properties
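As a rough illustration of the idea of driving one envelope per output channel, here is a small sketch of a linear ADSR envelope and a purely hypothetical mapping from gesture data to a left and a right envelope. The names `ADSR`, `envelope_gain` and `throw_envelopes`, the normalised `pan` and `speed` inputs and all the constants are assumptions of mine, not the mapping used in the actual system.

```python
from dataclasses import dataclass


@dataclass
class ADSR:
    attack: float   # seconds to rise from 0 to 1
    decay: float    # seconds to fall from 1 to the sustain level
    sustain: float  # level held until the release starts (0..1)
    release: float  # seconds to fall from the sustain level to 0


def envelope_gain(env, t, hold_time):
    """Linear ADSR gain at time t (seconds); the release starts at hold_time.

    Simplification: hold_time is assumed to be at least attack + decay.
    """
    if t < 0:
        return 0.0
    if t >= hold_time:
        elapsed = t - hold_time
        if elapsed >= env.release:
            return 0.0
        return env.sustain * (1.0 - elapsed / env.release)
    if t < env.attack:
        return t / env.attack
    if t < env.attack + env.decay:
        return 1.0 - (1.0 - env.sustain) * (t - env.attack) / env.decay
    return env.sustain


def throw_envelopes(pan, speed):
    """Hypothetical mapping from gesture data to one envelope per channel.

    pan:   -1 (left) .. +1 (right), e.g. derived from arm orientation
    speed:  0 .. 1, e.g. derived from acceleration
    """
    pan = max(-1.0, min(1.0, pan))
    speed = max(0.0, min(1.0, speed))
    release = 2.0 - 1.5 * speed      # a faster throw fades out sooner
    right_level = (pan + 1.0) / 2.0  # simple linear panning weight
    left_level = 1.0 - right_level
    left = ADSR(0.01, 0.1, left_level, release)
    right = ADSR(0.01, 0.1, right_level, release)
    return left, right


left, right = throw_envelopes(pan=1.0, speed=1.0)
print(envelope_gain(right, 1.25, hold_time=1.0))
```

Evaluating `envelope_gain` for each channel at every audio block gives two gain curves; the difference between them over time is what the listener would perceive as the sound object moving across the stereo field.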


The video below shows the throwing part only; I will update it once I get the chance to make a new video 😉