Homayoun Kamkar-Parsi

64.6 AS Apr 29

Multi-Speaker DOA Estimation in Binaural Hearing Aids using Deep Learning and Speaker Count Fusion

Farnaz Jazaeri, Homayoun Kamkar-Parsi, François Grondin et al.

Direction-of-arrival (DOA) estimation is crucial for extracting a target speaker's voice in binaural hearing aids operating in noisy, multi-speaker environments. Among the solutions developed for this task, a deep learning convolutional recurrent neural network (CRNN) model leveraging spectral phase differences and magnitude ratios between microphone signals is a popular option. In this paper, we explore adding source-count information for multi-source DOA estimation. We first consider dual-task training with joint multi-source DOA estimation and source counting. We then consider using the source count as an auxiliary feature in a standalone DOA estimation system, where the number of active sources (0, 1, or 2+) is integrated into the CRNN architecture through early, mid, and late fusion strategies. Experiments are performed using real binaural recordings. Results show that dual-task training does not improve DOA estimation performance, although it benefits source-count prediction. However, a ground-truth (oracle) source count used as an auxiliary feature significantly enhances standalone DOA estimation performance, with late fusion yielding up to 14% higher average F1-scores than the baseline CRNN. This highlights the potential of using source-count estimation for robust DOA estimation in binaural hearing aids.
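As a rough illustration of the ingredients named in the abstract, the sketch below computes an inter-channel phase difference and a log magnitude ratio for a single STFT bin, and appends a one-hot source-count vector (classes 0, 1, 2+) to a feature vector. The function names, the log-ratio form, and fusion by concatenation are assumptions made for illustration; the abstract does not specify how the paper implements its features or fusion.

```python
import cmath
import math


def binaural_features(left_bin, right_bin, eps=1e-12):
    """Return (phase difference, log magnitude ratio) for one complex STFT bin.

    Hypothetical helper: the paper's exact feature definition is not given here.
    """
    # Phase of left * conj(right) is the inter-channel phase difference.
    phase_diff = cmath.phase(left_bin * right_bin.conjugate())
    # eps guards against division by zero and log(0) for silent bins.
    log_mag_ratio = math.log((abs(left_bin) + eps) / (abs(right_bin) + eps))
    return phase_diff, log_mag_ratio


def one_hot_count(n_sources):
    """Map a source count to a one-hot vector over the classes 0, 1, 2+."""
    idx = min(max(n_sources, 0), 2)
    return [1.0 if i == idx else 0.0 for i in range(3)]


def late_fusion(embedding, n_sources):
    """Concatenate an (assumed) network embedding with the count one-hot.

    In a late-fusion setup this vector would feed the final output layer.
    """
    return list(embedding) + one_hot_count(n_sources)
```

Early and mid fusion would apply the same concatenation idea at the input features or at an intermediate layer, respectively, again under the assumption that fusion is done by concatenation.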

AS Nov 3, 2018

A Robust Target Linearly Constrained Minimum Variance Beamformer With Spatial Cues Preservation for Binaural Hearing Aids

Hala As'ad, Martin Bouchard, Homayoun Kamkar-Parsi

In this paper, a binaural beamforming algorithm for hearing aid applications is introduced. The beamforming algorithm is designed to be robust to some error in the estimate of the target speaker direction. The algorithm has two main components: a robust target linearly constrained minimum variance (TLCMV) algorithm based on imposing two constraints around the estimated direction of the target signal, and a post-processor to help with the preservation of binaural cues. The robust TLCMV provides a good level of noise reduction and a low level of target distortion under realistic conditions. The post-processor enhances the beamformer's ability to preserve the binaural cues for both diffuse-like background noise and directional interferers (competing speakers), while keeping a good level of noise reduction. The introduced algorithm requires neither knowledge or estimation of the directional interferers' directions nor the second-order statistics of noise-only components. It does require an estimate of the target speaker direction, but it is designed to be robust to some deviation from the estimated direction. Comprehensive evaluations against recently proposed state-of-the-art methods are performed under complex realistic acoustic scenarios generated in both anechoic and mildly reverberant environments, considering a mismatch between the estimated and true source directions of arrival. Mismatch between the anechoic propagation models used for the design of the beamformers and the mildly reverberant propagation models used to generate the simulated directional signals is also considered. The results illustrate the robustness of the proposed algorithm to such mismatches.
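For orientation, the generic linearly constrained minimum variance (LCMV) problem that this family of beamformers builds on can be written as below. This is the textbook formulation, not the paper's specific robust TLCMV design; the symbols are assumptions introduced here: $\mathbf{w}$ is the filter vector, $\mathbf{R}_y$ a covariance matrix of the microphone signals, $\mathbf{C}$ a matrix whose columns are steering vectors for the constrained directions, and $\mathbf{g}$ the vector of desired responses.

```latex
\begin{aligned}
\min_{\mathbf{w}} \; & \mathbf{w}^{H} \mathbf{R}_y \mathbf{w}
\quad \text{subject to} \quad \mathbf{C}^{H} \mathbf{w} = \mathbf{g}, \\
\mathbf{w}_{\mathrm{LCMV}} \; &= \mathbf{R}_y^{-1} \mathbf{C}
\left( \mathbf{C}^{H} \mathbf{R}_y^{-1} \mathbf{C} \right)^{-1} \mathbf{g}.
\end{aligned}
```

With two constraints placed around the estimated target direction, as the abstract describes, $\mathbf{C}$ would have two columns, which is one plausible reading of how robustness to direction error is obtained.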

2 Papers