Deep Audio Waveform Prior
This work addresses audio restoration tasks such as denoising and inpainting. Its contribution is incremental: it extends the deep prior concept, previously demonstrated for images, to raw audio waveforms.
The paper tackles unsupervised audio restoration by showing that state-of-the-art architectures for audio source separation contain deep priors when operating on raw waveforms. This enables restoration from corruptions such as background noise, reverberation, and gaps in the signal, without any training on clean data.
Convolutional neural networks contain strong priors for generating natural-looking images [1]. These priors enable image denoising, super-resolution, and inpainting in an unsupervised manner. Previous attempts to demonstrate similar ideas in audio, namely deep audio priors, (i) use hand-picked architectures such as harmonic convolutions, (ii) only work with spectrogram input, and (iii) have been used mostly for eliminating Gaussian noise [2]. In this work we show that existing state-of-the-art (SOTA) architectures for audio source separation contain deep priors even when working with the raw waveform. Deep priors can be discovered by training a neural network to generate a single corrupted signal when given white noise as input. A network with relevant deep priors is likely to generate a cleaner version of the signal before it converges on the corrupted signal. We demonstrate this restoration effect with several corruptions: background noise, reverberations, and a gap in the signal (audio inpainting).
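The fitting procedure described above (a fixed white-noise input, a single corrupted target, and intermediate outputs inspected before convergence) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the generator here is a hypothetical single 1-D convolution written in plain Python, standing in for a source separation network, and it is far too small to exhibit a useful prior. The test signal, the function names, and all hyperparameters are likewise made up for the example.

```python
import math
import random


def make_corrupted_signal(n, noise_std, rng):
    """Hypothetical test signal: a sine tone plus additive Gaussian noise."""
    clean = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
    corrupted = [c + rng.gauss(0.0, noise_std) for c in clean]
    return clean, corrupted


def generate(weights, z):
    """Toy generator (assumption): one zero-padded 1-D convolution over z."""
    n = len(z[0])
    k = len(weights[0])
    pad = k // 2
    out = [0.0] * n
    for c, channel in enumerate(z):
        for j in range(k):
            w = weights[c][j]
            for t in range(n):
                s = t + j - pad
                if 0 <= s < n:
                    out[t] += w * channel[s]
    return out


def fit_with_snapshots(corrupted, channels=4, kernel=5, steps=200,
                       lr=0.02, every=20, seed=0):
    """Fit the generator to ONE corrupted signal from fixed white noise.

    Returns the per-step MSE losses and a list of (step, output) snapshots.
    In a deep prior setup, the restored signal would be picked from these
    intermediate outputs rather than from the final, fully fitted one.
    """
    rng = random.Random(seed)
    n = len(corrupted)
    # Fixed white-noise input: sampled once, never updated.
    z = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(channels)]
    weights = [[0.0] * kernel for _ in range(channels)]
    pad = kernel // 2
    losses, snapshots = [], []
    for step in range(steps):
        out = generate(weights, z)
        residual = [o - x for o, x in zip(out, corrupted)]
        losses.append(sum(r * r for r in residual) / n)
        if step % every == 0:
            snapshots.append((step, out))
        # Plain gradient descent on the mean squared error.
        for c in range(channels):
            for j in range(kernel):
                grad = 0.0
                for t in range(n):
                    s = t + j - pad
                    if 0 <= s < n:
                        grad += residual[t] * z[c][s]
                weights[c][j] -= lr * 2.0 * grad / n
    return losses, snapshots


rng = random.Random(1)
clean, corrupted = make_corrupted_signal(64, 0.3, rng)
losses, snapshots = fit_with_snapshots(corrupted)
print("first loss:", losses[0], "last loss:", losses[-1])
print("snapshot steps:", [step for step, _ in snapshots])
```

In a realistic setting the toy convolution would be replaced by a waveform-domain separation network, and choosing which snapshot to keep (the stopping point) is the part that determines restoration quality; the sketch leaves that choice open.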