ASSDFeb 3, 2020

Time Difference of Arrival Estimation from Frequency-Sliding Generalized Cross-Correlations Using Convolutional Neural Networks

arXiv:2002.00641v131 citations
AI Analysis

This is an incremental improvement for signal processing applications, enhancing a traditional method with deep learning.

The paper tackled time delay estimation in adverse acoustic conditions by applying convolutional neural networks to frequency-sliding generalized cross-correlations, achieving excellent performance and generalization across different setups.

The interest in deep learning methods for solving traditional signal processing tasks has been steadily growing in the last years. Time delay estimation (TDE) in adverse scenarios is a challenging problem, where classical approaches based on generalized cross-correlations (GCCs) have been widely used for decades. Recently, the frequency-sliding GCC (FS-GCC) was proposed as a novel technique for TDE based on a sub-band analysis of the cross-power spectrum phase, providing a structured two-dimensional representation of the time delay information contained across different frequency bands. Inspired by deep-learning-based image denoising solutions, we propose in this paper the use of convolutional neural networks (CNNs) to learn the time-delay patterns contained in FS-GCCs extracted in adverse acoustic conditions. Our experiments confirm that the proposed approach provides excellent TDE performance while being able to generalize to different room and sensor setups.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes