AS CL LG SD · Oct 21, 2020

BERT for Joint Multichannel Speech Dereverberation with Spatial-aware Tasks

arXiv:2010.10892v2
Originality: Incremental advance
AI Analysis

This paper addresses speech enhancement for applications such as hearing aids and communication systems, but it appears incremental because it adapts existing transformer methods to a specific domain.

The paper tackles joint multichannel speech dereverberation with direction-of-arrival estimation and speech separation. It proposes a BERT-inspired transformer method that treats these tasks as sequence-to-sequence mapping, and its experimental results demonstrate the method's effectiveness.

We propose a method for joint multichannel speech dereverberation with two spatial-aware tasks: direction-of-arrival (DOA) estimation and speech separation. The proposed method addresses the involved tasks as a sequence-to-sequence mapping problem, which is general enough for a variety of front-end speech enhancement tasks. The proposed method is inspired by the excellent sequence modeling capability of bidirectional encoder representations from transformers (BERT). Instead of utilizing explicit representations from pretraining in a self-supervised manner, we utilize transformer-encoded hidden representations in a supervised manner. Both multichannel spectral magnitude and spectral phase information of varying-length utterances are encoded. Experimental results demonstrate the effectiveness of the proposed method.
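The abstract says that multichannel magnitude and phase from varying-length utterances are encoded, but it does not give the feature layout. The sketch below is only an illustration of one plausible input preparation, not the authors' implementation. It assumes the per-frame feature vector concatenates all channel magnitudes followed by all channel phases, and that shorter utterances are zero-padded and flagged with a boolean mask. The names `frame_features` and `pad_batch` are hypothetical.

```python
import cmath


def frame_features(stft):
    """Build one feature vector per frame from a multichannel STFT.

    stft[c][t][f] is the complex STFT bin for channel c, frame t, frequency f.
    Assumed layout (not from the paper): all magnitudes, then all phases.
    """
    n_frames = len(stft[0])
    feats = []
    for t in range(n_frames):
        mags = [abs(x) for ch in stft for x in ch[t]]
        phases = [cmath.phase(x) for ch in stft for x in ch[t]]
        feats.append(mags + phases)
    return feats


def pad_batch(utterances, pad_value=0.0):
    """Pad variable-length feature sequences to the longest one.

    Returns (padded, mask), where mask[i][t] is True for padded frames
    that a transformer encoder should ignore.
    """
    max_len = max(len(u) for u in utterances)
    dim = len(utterances[0][0])
    padded, mask = [], []
    for u in utterances:
        n_pad = max_len - len(u)
        padded.append(u + [[pad_value] * dim for _ in range(n_pad)])
        mask.append([False] * len(u) + [True] * n_pad)
    return padded, mask


# Toy data: two channels, two frequency bins per frame.
utt_a = [
    [[1 + 0j, 0 + 1j], [1 + 1j, -1 + 0j], [0j, 2 + 0j]],  # channel 0, 3 frames
    [[0 + 2j, 1 + 0j], [3 + 4j, 0 - 1j], [1 + 0j, 1 + 0j]],  # channel 1, 3 frames
]
utt_b = [
    [[1 + 0j, 1 + 0j]],  # channel 0, 1 frame
    [[0 + 1j, 0 + 1j]],  # channel 1, 1 frame
]

batch, mask = pad_batch([frame_features(utt_a), frame_features(utt_b)])
```

The mask marks padded frames with `True`, which is the convention PyTorch's `nn.TransformerEncoder` expects for `src_key_padding_mask`. The padded batch could therefore be converted to a tensor and passed to such an encoder, with task-specific output layers for dereverberation, DOA estimation, or separation added on top.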

