SD AS May 6, 2021

Deficient Basis Estimation of Noise Spatial Covariance Matrix for Rank-Constrained Spatial Covariance Matrix Estimation Method in Blind Speech Extraction

Yuto Kondo, Yuki Kubo, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari

arXiv:2105.02491v12.3

Originality: Incremental advance

AI Analysis

This work makes an incremental improvement to blind speech extraction for audio processing, specifically to noise modeling in scenarios where one directional target speech source is mixed with diffuse noise.

The paper tackles the problem of estimating the deficient basis of the noise spatial covariance matrix in rank-constrained spatial covariance matrix estimation (RCSCME) for blind speech extraction. The resulting method is reported to outperform conventional approaches under several noise conditions.

Rank-constrained spatial covariance matrix estimation (RCSCME) is a state-of-the-art blind speech extraction method applied to cases where one directional target speech and diffuse noise are mixed. In this paper, we propose a new algorithmic extension of RCSCME. RCSCME compensates for the one deficient rank of the diffuse noise spatial covariance matrix, which cannot be estimated via preprocessing such as independent low-rank matrix analysis, and simultaneously estimates the source model parameters. In the conventional RCSCME, the direction of the deficient basis is fixed in advance and only its scale is estimated; however, the candidate for this deficient basis is not unique in general. In the proposed RCSCME model, the deficient basis itself can be accurately estimated as a vector variable by solving a vector optimization problem. We also derive new update rules based on the EM algorithm. We confirm that the proposed method outperforms conventional methods under several noise conditions.
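
The rank-complement idea in the abstract can be illustrated with a small numerical sketch. It is not the authors' EM algorithm or their update rules. It only assumes that the completed noise covariance can be written as a rank-(M-1) matrix plus a scaled rank-1 term built from a basis vector. The function names, the random stand-in for the preprocessed covariance, the scale 0.5, and the choice of the null-space eigenvector as the fixed direction are all hypothetical. The sketch shows that a fixed direction and a perturbed one both restore full rank, which is one way to see that the candidate basis is not unique.

```python
import numpy as np


def rank_deficient_scm(num_mics, rng):
    """Random Hermitian PSD matrix of rank num_mics - 1.

    Hypothetical stand-in for a noise spatial covariance matrix
    obtained from preprocessing, which lacks one rank.
    """
    shape = (num_mics, num_mics - 1)
    a = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    return a @ a.conj().T


def null_direction(scm):
    """Unit eigenvector for the smallest eigenvalue of a Hermitian matrix."""
    _, eigvecs = np.linalg.eigh(scm)  # eigenvalues are returned in ascending order
    return eigvecs[:, 0]


def complement(scm, basis, scale):
    """Add the rank-1 term scale * b b^H to the rank-deficient matrix."""
    return scm + scale * np.outer(basis, basis.conj())


rng = np.random.default_rng(0)
M = 4  # number of microphones (assumed for the example)

R_low = rank_deficient_scm(M, rng)

# Fixed-direction case: the basis is chosen in advance, only the scale is free.
b_fixed = null_direction(R_low)
R_fixed = complement(R_low, b_fixed, scale=0.5)

# A different candidate direction: a small perturbation of the fixed one.
b_other = b_fixed + 0.1 * rng.standard_normal(M)
R_other = complement(R_low, b_other, scale=0.5)

for name, mat in [("R_low", R_low), ("R_fixed", R_fixed), ("R_other", R_other)]:
    print(name, "rank =", np.linalg.matrix_rank(mat))
```

Because more than one direction completes the rank, treating the basis as a free vector variable, as the proposed method does, gives the model more freedom than tuning the scale alone.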
