Single Channel Audio Source Separation using Convolutional Denoising Autoencoders
This work is an incremental improvement relevant to audio processing applications such as speech enhancement and music separation.
The paper tackles single-channel audio source separation with convolutional denoising autoencoders (CDAEs): each CDAE extracts one source and treats the remaining sources as noise. The CDAEs perform slightly better than deep feedforward neural networks while using fewer parameters.
Deep learning techniques have recently been used to tackle the audio source separation problem. In this work, we propose to use deep fully convolutional denoising autoencoders (CDAEs) for monaural audio source separation. We use as many CDAEs as there are sources to be separated from the mixed signal. Each CDAE is trained to separate one source and treats the other sources as background noise. The main idea is to allow each CDAE to learn spectral-temporal filters and features suited to its corresponding source. Our experimental results show that CDAEs perform source separation slightly better than deep feedforward neural networks (FNNs), even though the CDAEs have fewer parameters.
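The one-network-per-source setup and the parameter argument can be illustrated with a small sketch. It is not the authors' implementation. It assumes that the mixture magnitude spectrogram is roughly the sum of the source spectrograms, and every name, layer size, and toy value in it (`make_training_pairs`, `conv2d_params`, `dense_params`, the 3x5 kernel, the 15-frame context) is hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: names and sizes are illustrative, not taken from the paper.

def mix(spectrograms):
    """Sum same-shaped magnitude spectrograms element-wise.

    Assumption: the mixture magnitude is approximately the sum of the sources.
    """
    first = spectrograms[0]
    return [
        [sum(s[t][f] for s in spectrograms) for f in range(len(first[0]))]
        for t in range(len(first))
    ]


def make_training_pairs(sources):
    """Return {source_name: (network_input, target)}, one pair per CDAE.

    Every CDAE receives the same mixture as input; only the target differs,
    so the remaining sources play the role of noise to be removed.
    """
    mixture = mix(list(sources.values()))
    return {name: (mixture, spec) for name, spec in sources.items()}


def conv2d_params(in_channels, out_channels, kernel_h, kernel_w):
    """Weights plus biases of one 2-D convolutional layer."""
    return out_channels * in_channels * kernel_h * kernel_w + out_channels


def dense_params(n_inputs, n_outputs):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs


# Toy spectrograms: 2 time frames x 3 frequency bins.
sources = {
    "vocals": [[1.0, 0.0, 2.0], [0.5, 0.5, 0.0]],
    "accompaniment": [[0.0, 3.0, 1.0], [1.5, 0.5, 2.0]],
}
pairs = make_training_pairs(sources)

# A conv layer's size depends only on kernel shape and channel counts,
# while a dense layer's size grows with the input dimension.
conv_count = conv2d_params(in_channels=1, out_channels=12, kernel_h=3, kernel_w=5)
dense_count = dense_params(n_inputs=15 * 1025, n_outputs=1025)
print(conv_count, dense_count)
```

Because the convolutional filters are shared across time and frequency positions, the parameter count of such a layer does not depend on the spectrogram size. This is one plausible reason a CDAE can be smaller than an FNN that operates on stacked frames.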