SD | LG | NE | Mar 19, 2015

Deep Transform: Time-Domain Audio Error Correction via Probabilistic Re-Synthesis

arXiv:1503.05849v1 | 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses error correction for audio communications devices, but it appears incremental because it builds on existing neural network methods for speech processing.

The paper tackles the problem of correcting errors in time-domain audio signals by training a convolutional deep neural network to re-synthesize speech from a given speaker, and it demonstrates recovery of that speaker's speech after extreme degradation.

In the process of recording, storage and transmission of time-domain audio signals, errors may be introduced that are difficult to correct in an unsupervised way. Here, we train a convolutional deep neural network to re-synthesize input time-domain speech signals at its output layer. We then use this abstract transformation, which we call a deep transform (DT), to perform probabilistic re-synthesis on further speech (of the same speaker) which has been degraded. Using the convolutive DT, we demonstrate the recovery of speech audio that has been subject to extreme degradation. This approach may be useful for correction of errors in communications devices.
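The re-synthesis step described in the abstract can be pictured as passing a degraded time-domain signal through a trained model and reading the corrected waveform from its output. The sketch below shows only that outer pipeline, under assumptions not stated in the abstract: the signal is cut into overlapping frames, each frame is passed through a model, and the outputs are averaged back together by overlap-add. The names `frame_signal`, `overlap_add`, `resynthesize`, and `identity_model`, and the frame and hop sizes, are hypothetical. `identity_model` is a placeholder standing in for the trained convolutional network, which is not reproduced here.

```python
import math


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, zero-padding the tail."""
    signal = list(signal)
    frames = []
    for start in range(0, len(signal), hop):
        frame = signal[start:start + frame_len]
        frame = frame + [0.0] * (frame_len - len(frame))
        frames.append(frame)
    return frames


def overlap_add(frames, hop, length):
    """Average overlapping frames back into one signal of the given length."""
    frame_len = len(frames[0])
    total = [0.0] * (hop * (len(frames) - 1) + frame_len)
    count = [0] * len(total)
    for i, frame in enumerate(frames):
        for j, value in enumerate(frame):
            total[i * hop + j] += value
            count[i * hop + j] += 1
    return [t / c for t, c in zip(total, count)][:length]


def identity_model(frame):
    """Placeholder (assumption): a trained network would map a degraded
    frame to a cleaned frame of the same length. Here it returns its input."""
    return list(frame)


def resynthesize(signal, model, frame_len=8, hop=4):
    """Frame the signal, apply the model per frame, and overlap-add."""
    frames = frame_signal(signal, frame_len, hop)
    outputs = [model(f) for f in frames]
    return overlap_add(outputs, hop, len(signal))


# Usage: with the identity placeholder, the pipeline reproduces its input.
sig = [math.sin(0.3 * n) for n in range(30)]
out = resynthesize(sig, identity_model)
```

Replacing `identity_model` with a function that wraps a trained network would give a frame-wise correction pipeline of this general shape. Whether the paper frames its input this way is not stated in the abstract.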

