A Complex Matrix Factorization approach to Joint Modeling of Magnitude and Phase for Source Separation
This paper addresses the problem of improving speech source separation for audio processing applications, but its contribution is incremental because it builds on existing NMF methods.
The paper tackled the problem of source separation by incorporating spectral phase into the decomposition process, which reduced undesired traces of interfering sources. The result was effective separation demonstrated through objective quality evaluations on the GRID corpus.
Conventional NMF methods for source separation factorize the matrix of spectral magnitudes. Spectral phase is not included in the decomposition process of these methods. However, the phase of the speech mixture is generally used in reconstructing the target speech signal. This results in undesired traces of interfering sources in the target signal. In this paper, the spectral phase is incorporated in the decomposition process itself. Additionally, the complex matrix factorization problem is reduced to an NMF problem using simple transformations. This results in effective separation of speech mixtures since both magnitude and phase are utilized jointly in the separation process. Improvements in source separation results are demonstrated using objective quality evaluations on the GRID corpus.
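For reference, the conventional magnitude-only pipeline that the abstract contrasts against can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: it assumes two pre-trained nonnegative speaker dictionaries (`W1`, `W2`), Euclidean multiplicative updates, and a soft-mask reconstruction, and every function name here is hypothetical. Because the mask is real and nonnegative, each estimate inherits the phase of the mixture, which is the limitation the paper sets out to address.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero in the updates


def nmf(V, rank, n_iter=200, seed=0):
    """Factorize a nonnegative matrix V (freq x time) as W @ H.

    Uses the standard multiplicative updates for the Euclidean cost.
    This could be used to learn a dictionary W from one speaker's
    training magnitudes (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((rank, V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def estimate_activations(V, W, n_iter=200, seed=0):
    """Estimate activations H for a fixed dictionary W so that W @ H approximates V."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
    return H


def separate_with_mixture_phase(X, W1, W2, n_iter=200):
    """Split a complex mixture spectrogram X into two source estimates.

    Only |X| enters the factorization. The phase of X is reused
    unchanged for both outputs through a real-valued soft mask.
    """
    V = np.abs(X)                      # magnitudes only; phase is ignored here
    W = np.hstack([W1, W2])            # concatenated speaker dictionaries
    H = estimate_activations(V, W, n_iter=n_iter)
    k = W1.shape[1]
    V1 = W1 @ H[:k]                    # magnitude estimate for source 1
    V2 = W2 @ H[k:]                    # magnitude estimate for source 2
    mask1 = V1 / (V1 + V2 + EPS)       # real mask with values in [0, 1]
    S1 = mask1 * X                     # carries the mixture phase
    S2 = (1.0 - mask1) * X             # carries the mixture phase as well
    return S1, S2
```

In practice `X` would come from a short-time Fourier transform of the mixture and the outputs would be inverted back to waveforms. Those steps are omitted to keep the sketch self-contained.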