Speech Separation Using Partially Asynchronous Microphone Arrays Without Resampling
This work addresses the problem of separating speech sources captured by distributed microphone arrays whose devices sample at slightly different rates, particularly when sources or microphones are moving. It offers a practical solution for applications such as mobile audio capture.
The paper tackles speech separation with partially asynchronous microphone arrays by proposing a method that avoids sample rate offset estimation and resampling. Instead, it divides the distributed array into synchronous subarrays, uses all of them jointly to estimate time-varying signal statistics, and designs separate time-varying spatial filters for each subarray. The method is demonstrated on speech mixtures recorded with both stationary and moving arrays.
We consider the problem of separating speech sources captured by multiple spatially separated devices, each of which has multiple microphones and samples its signals at a slightly different rate. Most asynchronous array processing methods rely on sample rate offset estimation and resampling, but these offsets can be difficult to estimate if the sources or microphones are moving. We propose a source separation method that does not require offset estimation or signal resampling. Instead, we divide the distributed array into several synchronous subarrays. All arrays are used jointly to estimate the time-varying signal statistics, and those statistics are used to design separate time-varying spatial filters in each array. We demonstrate the method for speech mixtures recorded on both stationary and moving microphone arrays.
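To make the filter design step concrete, the following is a minimal sketch and not the paper's implementation. It assumes a time-varying multichannel Wiener filter at a single frequency bin, per-source spatial covariance matrices that are already known for each subarray, and time-varying source powers that have already been estimated jointly from all arrays. The names `array_wiener_filters`, `separate`, `source_covs`, `variances`, and `noise_var` are hypothetical. Only within-array covariances appear, because the phase relationship between arrays drifts when their sample rates differ.

```python
import numpy as np

def array_wiener_filters(source_covs, variances, noise_var=1e-3):
    """Time-varying multichannel Wiener filters for ONE synchronous subarray.

    source_covs: (K, M, M) spatial covariance of each source at this subarray
                 (assumed known; illustrative input, not from the paper)
    variances:   (T, K) nonnegative source powers per frame, shared by all
                 subarrays (assumed to be estimated jointly elsewhere)
    returns:     (T, K, M, M); W[t, k] @ x[t] estimates source k's image
    """
    M = source_covs.shape[-1]
    # Per-frame, per-source covariances: (T, K, M, M)
    scaled = variances[:, :, None, None] * source_covs[None]
    # Per-frame mixture covariance with a noise floor for invertibility
    mix = scaled.sum(axis=1) + noise_var * np.eye(M)
    mix_inv = np.linalg.inv(mix)  # stacked inverse, shape (T, M, M)
    return scaled @ mix_inv[:, None]

def separate(frames_per_array, covs_per_array, variances, noise_var=1e-3):
    """Filter each subarray separately using the shared source powers.

    frames_per_array: list of (T, M_m) complex STFT frames, one per subarray
    covs_per_array:   list of (K, M_m, M_m) source covariances, one per subarray
    returns:          list of (T, K, M_m) source image estimates
    """
    outputs = []
    for x, covs in zip(frames_per_array, covs_per_array):
        W = array_wiener_filters(covs, variances, noise_var)
        outputs.append(np.einsum('tkij,tj->tki', W, x))
    return outputs
```

In this sketch the subarrays never exchange phase-sensitive quantities. They share only the nonnegative power sequence `variances`, which is the sense in which the statistics are joint while the filters stay local to each array.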