Separake: Source Separation with a Little Help From Echoes
This work addresses audio source separation in noisy or reverberant environments. The approach is an incremental extension of existing methods, but it provides specific gains.
The paper tackles sound source separation by exploiting multipath echoes instead of ignoring them. It shows that knowing the positions of a few virtual microphones generated by echoes improves performance over anechoic methods. Concretely, echoes improve multichannel non-negative matrix factorization, and with magnitude-only information they enable separation where it was previously impossible.
It is commonly believed that multipath hurts various audio processing algorithms. At odds with this belief, we show that multipath in fact helps sound source separation, even with very simple propagation models. Unlike most existing methods, we neither ignore the room impulse responses nor attempt to estimate them fully. Rather, we assume that we know the positions of a few virtual microphones generated by echoes, and we show how this gives us enough spatial diversity to get a performance boost over the anechoic case. We show improvements for two standard algorithms: one that uses only the magnitudes of the transfer functions, and one that also uses the phases. Concretely, we show that multichannel non-negative matrix factorization aided with a small number of echoes beats the vanilla variant of the same algorithm, and that with magnitude information only, echoes enable separation where it was previously impossible.
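To make the notion of a virtual microphone concrete, the following is a minimal sketch of how first-order virtual microphone positions could be computed with the standard image model. It is not the authors' code. It assumes a shoebox room with one corner at the origin and walls aligned with the coordinate planes. The function name `first_order_image_microphones`, the room dimensions, the microphone and source positions, and the speed of sound of 343 m/s are all illustrative assumptions.

```python
import numpy as np


def first_order_image_microphones(mic, room_dims):
    """Mirror a microphone across each wall of a shoebox room.

    Assumption (not from the paper): the room spans [0, room_dims[k]]
    along each axis k, so each axis has one wall at 0 and one at room_dims[k].
    Returns an array with one virtual microphone position per wall.
    """
    mic = np.asarray(mic, dtype=float)
    room_dims = np.asarray(room_dims, dtype=float)
    images = []
    for axis in range(mic.size):
        for wall in (0.0, room_dims[axis]):
            image = mic.copy()
            # Reflection of coordinate x across the plane x = wall.
            image[axis] = 2.0 * wall - mic[axis]
            images.append(image)
    return np.array(images)


# Hypothetical geometry, in metres.
mic = [1.0, 2.0, 1.5]
room = [5.0, 4.0, 3.0]
source = np.array([4.0, 1.0, 1.0])

virtual_mics = first_order_image_microphones(mic, room)

# Under the image model, the echo that reaches the real microphone via a wall
# travels the same distance as a direct path to the corresponding virtual mic.
speed_of_sound = 343.0  # m/s, assumed
direct_delay = np.linalg.norm(np.asarray(mic) - source) / speed_of_sound
echo_delays = np.linalg.norm(virtual_mics - source, axis=1) / speed_of_sound
```

Each row of `virtual_mics` can be treated as an extra sensor position with its own propagation delay in `echo_delays`, which is one way to picture the additional spatial diversity that a few known echoes provide.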