This repository contains all the code to reproduce the results of the paper Separake: Source separation with a little help from echoes.
We are happy to answer any questions about either the code or the theory behind it. Just ask!
It is commonly believed that multipath hurts various audio processing algorithms. At odds with this belief, we show that multipath in fact helps sound source separation, even with very simple propagation models. Unlike most existing methods, we neither ignore the room impulse responses, nor do we attempt to estimate them fully. We rather assume that we know the positions of a few virtual microphones generated by echoes, and we show how this gives us enough spatial diversity to get a performance boost over the anechoic case. We show improvements for two standard algorithms—one that uses only magnitudes of the transfer functions, and one that also uses the phases. Concretely, we show that multichannel non-negative matrix factorization aided with a small number of echoes beats the vanilla variant of the same algorithm, and that with magnitude information only, echoes enable separation where it was previously impossible.
- Robin Scheibler (TMU)
- Diego Di Carlo (INRIA)
- Antoine Deleforge (INRIA)
- Ivan Dokmanić (UIUC)
Robin Scheibler
Ono Laboratory
Graduate School of System Design
Tokyo Metropolitan University
6-6 Asahigaoka, Hino city, Tokyo
191-0065 Japan
- `separake_mu_early.py` — uses the Ozerov and Févotte MU algorithm. This is the original attempt by Robin.
- `separake_near_wall.py` — implements the image microphone model and places the microphones close to a wall. No separation yet.
- `utilities.py` — contains auxiliary methods.
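The image microphone model mentioned above rests on a simple geometric fact: an echo from a wall is equivalent to the direct sound picked up by a virtual (image) microphone mirrored across that wall. The helper below is a hypothetical two-dimensional illustration of that reflection, not the code from `separake_near_wall.py`.

```python
import numpy as np

def image_microphone(mic, wall_x):
    """Reflect a 2-D microphone position across the vertical wall x = wall_x.

    The echo received by `mic` from that wall is equivalent to the direct
    sound received by this virtual (image) microphone.
    """
    img = np.array(mic, dtype=float)
    img[0] = 2.0 * wall_x - img[0]  # mirror the x-coordinate across the wall
    return img

# A microphone 0.5 m in front of the wall at x = 0 has its image 0.5 m behind it.
img = image_microphone([0.5, 2.0], 0.0)  # → array([-0.5,  2. ])
```

Knowing these image positions is what provides the extra spatial diversity exploited in the paper.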
To recreate the figures from the original simulated data (stored in `data/paper_results/`), run
./make_figures.sh
To redo all the simulations, run
[TBA]
[TBA]
The recorded samples are stored in the `recordings` folder. Detailed descriptions and instructions are provided along with the data.
TBA
The authors of [Ozerov and Févotte, 2010] generously provide a MATLAB implementation of the MU-NMF and EM-NMF methods for stereo separation. We ported this code to Python 3 and extended it to an arbitrary number of input channels. We think this implementation could be useful to the community and have released the code (link will go here after review).
The main modifications are the following.

- First, the original code was restricted to the 2-channel case; we generalized it to an arbitrary number of channels.
- Second, MU-NMF was modified to handle the sparsity constraint described in the paper.
- Third, the EM method degenerates when zero-valued entries are present in the dictionary matrix, so this case required special handling.
- Finally, the code was further modified to deal with fixed dictionary and channel-model matrices, which are normalized in order to avoid indeterminacy issues [Ozerov and Févotte, 2010].
To conclude, no *simulated annealing* strategies are used in the final experiments. In fact, in some preliminary and informal investigations we noticed that this yields better results than using annealing. In the experiments, the number of iterations was set to [TBA].
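To give a rough flavor of the kind of algorithm involved, here is a minimal single-channel NMF sketch with the classical Lee–Seung Euclidean multiplicative updates and an L1 sparsity penalty on the activations. This toy is an assumption of ours for illustration only; the released code implements the multichannel model of Ozerov and Févotte, which is considerably more involved.

```python
import numpy as np

def mu_nmf(V, n_atoms, n_iter=200, sparsity=0.0, seed=0):
    """Toy single-channel NMF, V ≈ W @ H, with multiplicative updates.

    `sparsity` adds an L1 penalty on the activations H, mirroring (in a much
    simpler setting) the sparsity constraint discussed above.
    """
    rng = np.random.default_rng(seed)
    F, N = V.shape
    W = rng.random((F, n_atoms)) + 1e-3  # dictionary of spectral atoms
    H = rng.random((n_atoms, N)) + 1e-3  # activations over time
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + sparsity + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy check: factorize a random non-negative "spectrogram".
V = np.abs(np.random.default_rng(1).random((32, 40)))
W, H = mu_nmf(V, n_atoms=5)
```

Because the updates are multiplicative, non-negativity is preserved at every iteration without any projection step, which is what makes this family of algorithms so convenient in practice.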
- A working distribution of Python 3.5 (but 2.7 should work too).
- Numpy, Scipy
- We use the Anaconda distribution to simplify the setup of the environment.
- The computations are very heavy, so we use the MKL extension of Anaconda to speed things up. A free license is available for academics.
- We used ipyparallel and joblib for parallel computations.
- matplotlib and seaborn for plotting the results.
- mir_eval is used for its BSS evaluation routines.
The pyroomacoustics package is used for the STFT, fractional delay filters, microphone array generation, and more. Install it with

    pip install pyroomacoustics
List of standard packages needed: `numpy`, `scipy`, `pandas`, `ipyparallel`, `seaborn`, `zmq`, `joblib`, `samplerate`, `mir_eval`
TBA
Copyright (c) 2016, Antoine Deleforge, Diego Di Carlo, Ivan Dokmanić, Robin Scheibler
All the code in this repository is under the MIT License.