SD · LG · AS · Mar 10, 2019

Deep Griffin-Lim Iteration

arXiv:1903.03971v1 · 58 citations
Originality: Incremental advance
AI Analysis

This work addresses phase retrieval for speech processing applications, presenting an incremental improvement over existing methods.

The paper tackles phase reconstruction from amplitude spectrograms of speech. It proposes a hybrid architecture that stacks sub-blocks, each combining two fixed Griffin-Lim-inspired layers with a deep neural network. The number of sub-blocks can be adjusted to trade reconstruction quality against computational cost.

This paper presents a novel method for reconstructing phase from a given amplitude spectrogram alone, combining a signal-processing-based approach with a deep neural network (DNN). To retrieve a time-domain signal from its amplitude spectrogram, the corresponding phase is required. One popular phase reconstruction method is the Griffin-Lim algorithm (GLA), which is based on the redundancy of the short-time Fourier transform. However, GLA often requires many iterations and produces low-quality signals owing to the lack of prior knowledge of the target signal. To address these issues, we propose an architecture that stacks sub-blocks, each consisting of two GLA-inspired fixed layers and a DNN. The number of stacked sub-blocks is adjustable, so performance can be traded against computational load according to the requirements of the application. The effectiveness of the proposed method is investigated by reconstructing phases from amplitude spectrograms of speech.
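To make the structure described in the abstract more concrete, here is a minimal NumPy sketch of the two GLA projections and of a stackable sub-block built around them. It is an illustration under stated assumptions, not the authors' implementation: the function names (`stft`, `istft`, `project_amplitude`, `project_consistency`, `sub_block`, `stacked_blocks`, `spectral_error`), the window and hop settings, and in particular the residual form `Z - dnn(X, Y, Z)` are hypothetical choices, since the abstract only states that each sub-block contains two GLA-inspired fixed layers and a DNN. With `dnn=None` the sub-block reduces to one plain Griffin-Lim iteration.

```python
import numpy as np

def stft(x, n_fft=256, hop=64):
    """Short-time Fourier transform with a periodic Hann window (frames x bins)."""
    win = np.hanning(n_fft + 1)[:-1]
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(X, n_fft=256, hop=64):
    """Least-squares inverse STFT: windowed overlap-add, normalized by sum of win**2."""
    win = np.hanning(n_fft + 1)[:-1]
    n_frames = X.shape[0]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    for i in range(n_frames):
        x[i * hop:i * hop + n_fft] += frames[i] * win
        norm[i * hop:i * hop + n_fft] += win ** 2
    return x / np.maximum(norm, 1e-12)

def project_amplitude(X, A, eps=1e-12):
    """Fixed layer 1: keep the phase of X, replace its magnitude with A."""
    return A * X / np.maximum(np.abs(X), eps)

def project_consistency(X):
    """Fixed layer 2: map X to the nearest spectrogram of an actual signal."""
    return stft(istft(X))

def sub_block(X, A, dnn=None):
    """One sub-block: two fixed GLA-style layers, then an optional DNN correction.

    ASSUMPTION: the DNN is modeled as a callable predicting a residual that is
    subtracted from Z. With dnn=None this is one plain Griffin-Lim iteration.
    """
    Y = project_amplitude(X, A)
    Z = project_consistency(Y)
    residual = dnn(X, Y, Z) if dnn is not None else 0.0
    return Z - residual

def stacked_blocks(X0, A, n_blocks=10, dnn=None):
    """Apply the sub-block n_blocks times; more blocks cost more computation."""
    X = X0
    for _ in range(n_blocks):
        X = sub_block(X, A, dnn)
    return X

def spectral_error(X, A):
    """Relative mismatch between A and the magnitude of the next consistent iterate."""
    Z = project_consistency(project_amplitude(X, A))
    return np.linalg.norm(A - np.abs(Z)) / np.linalg.norm(A)
```

A typical use would start from the given amplitude `A` with random phase, for example `X0 = A * np.exp(1j * rng.uniform(-np.pi, np.pi, A.shape))`, call `stacked_blocks(X0, A, n_blocks=B)`, and recover a waveform with `istft`. Varying `B` is the knob that corresponds to the adjustable number of sub-blocks mentioned in the abstract.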

