The project has two main objectives. The first is to use convolutional neural networks to classify audio samples into a predefined set of categories, with a focus on environmental sounds. The second is to build an immersive 3D virtual environment where sounds are placed according to their similarity, allowing users to explore and play them as they "walk" through the space.
The first part of the project could be useful to sound designers, who could build their own dataset and obtain automatic classification of new sounds they want to use. It could also be used to listen to the environment and detect notable events happening in the "soundscape" surrounding the user.
The second part of the project, on the other hand, can be especially useful for exhibitions or music performances. The 3D environment can be explored in many ways: for example, a composer could define a logic to walk through the 3D space in a semi-automated way, or devices such as a Kinect or a Leap Motion could be used to move the focus to a particular zone of the environment.
We based our implementation on a paper by Karol J. Piczak ("Environmental Sound Classification with Convolutional Neural Networks"), which describes an approach well suited to what we wanted to do.
*The architecture of the approach proposed in the paper*
A dataset of approximately 4000 sounds, divided into four categories (wood, water, fans and voices), was built as a training set for the machine learning step. The idea is to use the last layer of the convnet as a feature vector for each sound, and then to use t-SNE to reduce the feature vector to 3 dimensions, so that each audio sample can be plotted in our 3D environment.
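As a minimal sketch of this step (assuming a trained Keras model; the file names and layer indexing are hypothetical, not the project's actual code), the activations of the layer before the classifier output can be extracted and reduced to 3D with scikit-learn's t-SNE:

```python
import numpy as np
from tensorflow.keras.models import Model, load_model
from sklearn.manifold import TSNE

# Hypothetical trained classifier; the actual project may use a different framework.
model = load_model("sound_classifier.h5")

# Sub-model that stops before the softmax: its output is the per-sound feature vector.
feature_model = Model(inputs=model.input, outputs=model.layers[-2].output)

# spectrograms: (n_sounds, height, width, 1) array of preprocessed audio chunks.
spectrograms = np.load("spectrograms.npy")
features = feature_model.predict(spectrograms)

# Reduce each feature vector to 3 dimensions for placement in the 3D environment.
coords_3d = TSNE(n_components=3, perplexity=30, random_state=0).fit_transform(features)
np.save("coords_3d.npy", coords_3d)
```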
While the sound classification is still a work in progress, we managed to build the 3D environment using openFrameworks, running t-SNE directly on a set of spectral features describing each audio chunk. There are currently two ways to explore it: using the mouse and hovering over the spheres to listen to the associated sound, or sending X, Y and Z coordinates via OSC to position the listener in the space; in the latter case the 6 nearest sounds are played according to their position, using ambisonics.
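For example, a small controller script could drive the listener position over OSC. This is a sketch using the python-osc package; the port 12345 and the address /listener/position are assumptions, not the values used by the project's openFrameworks receiver:

```python
import math
import time
from pythonosc.udp_client import SimpleUDPClient

# Hypothetical host, port and OSC address; adjust to match the actual receiver.
client = SimpleUDPClient("127.0.0.1", 12345)

# Walk the listener along a slow circle through the space;
# the environment plays the 6 nearest sounds around each position.
for step in range(1000):
    t = step * 0.05
    x, y, z = math.cos(t), 0.0, math.sin(t)
    client.send_message("/listener/position", [x, y, z])
    time.sleep(0.05)
```

The same message could just as easily be produced by a Kinect or Leap Motion bridge, which is what makes OSC a convenient control interface here.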
Contacts:

- Max Zanoni: massimiliano.zanoni@polimi.it
- Jacopo Foglietti: fogliettijacopo@gmail.com
- Luca Mucci: luca4cmp@gmail.com
- Daniele Ciminieri: daniele@dotdotdot.it
- Gabriele Balzano: gabry.balza@gmail.com
- Sergio Missaglia: sergio.missaglia1@gmail.com
- Alessandro Inguglia: alessandro@recipient.cc
- Massimiliano Viel: info@massimilianoviel.net
- Francesca Bonalume: bonalumefrancesca@gmail.com