Independent Vector Analysis with Deep Neural Network Source Priors
This work addresses speech separation for audio processing applications and offers an incremental improvement over existing methods.
The paper tackles the problem of speech separation using independent vector analysis by developing deep neural network priors to better capture speech structures, resulting in improved convergence speed for online implementation and higher signal-to-interference ratio for batch implementation.
This paper studies density priors for independent vector analysis (IVA), with convolutive speech mixture separation as the exemplary application. Most existing source priors for IVA are too simple to capture the fine structure of speech. Here, we show for the first time that the derivative of the speech density can be estimated efficiently with universal approximators such as deep neural networks (DNNs) by optimizing certain proxy separation-related performance indices. Experimental results suggest that the resulting neural network density priors consistently outperform previous priors in convergence speed for online implementation and in signal-to-interference ratio (SIR) for batch implementation.
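
To make the role of the source prior concrete, here is a minimal sketch of a batch natural-gradient IVA update in NumPy. It is not the paper's implementation. It assumes the prior enters the update only through a nonlinearity derived from the derivative of the log density, and it uses the common multivariate Laplace prior as a baseline. The function names, array shapes, and learning rate are hypothetical choices for illustration; a trained DNN would take the place of `laplace_score`.

```python
import numpy as np

def laplace_score(Y, eps=1e-8):
    """Baseline nonlinearity for a multivariate Laplace prior (illustrative).

    Y has shape (K, N, T): K frequency bins, N sources, T frames.
    Each source frame is divided by its l2 norm taken across all bins.
    """
    norm = np.sqrt(np.sum(np.abs(Y) ** 2, axis=0, keepdims=True))
    return Y / (norm + eps)

def iva_natural_gradient_step(W, X, score_fn=laplace_score, lr=0.1):
    """One batch natural-gradient update of the demixing matrices.

    W: (K, N, N) demixing matrices, one per frequency bin.
    X: (K, N, T) STFT of the observed mixtures.
    score_fn: maps separated outputs (K, N, T) to an array of the same
              shape; this is the slot a learned prior would fill (assumption).
    """
    T = X.shape[2]
    Y = W @ X                                    # separated outputs, (K, N, T)
    Phi = score_fn(Y)                            # prior-dependent nonlinearity
    R = Phi @ np.conj(np.swapaxes(Y, 1, 2)) / T  # sample mean of phi(y) y^H per bin
    I = np.eye(X.shape[1])
    return W + lr * (I - R) @ W                  # W <- W + lr * (I - R) W
```

Because the prior appears only inside `score_fn`, swapping in a different density model leaves the rest of the update unchanged, which is the design point this sketch is meant to show.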