SD, CV, AS | Oct 18, 2022

BirdSoundsDenoising: Deep Visual Audio Denoising for Bird Sounds

arXiv:2210.10196v1 | 23 citations | h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses denoising for bird sound analysis, with potential generalization to other audio tasks, but it is incremental because it adapts existing image segmentation techniques to audio.

The paper tackles audio denoising for bird sounds by recasting the problem as image segmentation. It trains a deep visual audio denoising model on a large-scale natural noise dataset and reports state-of-the-art performance.

Audio denoising has been explored for decades using both traditional and deep learning-based methods. However, these methods are still limited to either manually added artificial noise or lower denoised audio quality. To overcome these challenges, we collect a large-scale natural noise bird sound dataset. We are the first to transfer the audio denoising problem into an image segmentation problem and propose a deep visual audio denoising (DVAD) model. With a total of 14,120 audio images, we develop an audio ImageMask tool and propose to use a few-shot generalization strategy to label these images. Extensive experimental results demonstrate that the proposed model achieves state-of-the-art performance. We also show that our method can be easily generalized to speech denoising, audio separation, audio enhancement, and noise estimation.
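The idea of treating denoising as segmentation of an audio image can be illustrated with a minimal sketch. It assumes the audio image is a short-time Fourier transform (STFT) magnitude and that a binary mask marks the time-frequency cells to keep. The function `placeholder_segment` is a hypothetical stand-in that thresholds against a per-frequency noise floor; it is not the DVAD model, which the abstract describes as a learned segmentation network. The frame length, hop size, and threshold factor are likewise illustrative assumptions.

```python
import numpy as np


def stft(x, frame_len=256, hop=64):
    """Complex spectrogram of shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def istft(spec, frame_len=256, hop=64):
    """Weighted overlap-add inverse of stft() above."""
    window = np.hanning(frame_len)
    n_frames = spec.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(out)
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    for i in range(n_frames):
        sl = slice(i * hop, i * hop + frame_len)
        out[sl] += frames[i] * window
        norm[sl] += window ** 2
    # Guard against division by zero where the window tapers to zero.
    return out / np.maximum(norm, 1e-8)


def placeholder_segment(magnitude, factor=3.0):
    """Hypothetical stand-in for a segmentation model.

    Marks a time-frequency cell as signal when its magnitude exceeds
    `factor` times the median magnitude of that frequency bin.
    """
    noise_floor = np.median(magnitude, axis=0, keepdims=True)
    return magnitude > factor * noise_floor


def denoise(x, frame_len=256, hop=64):
    """Mask the spectrogram and resynthesize; returns (audio, mask)."""
    spec = stft(x, frame_len, hop)
    mask = placeholder_segment(np.abs(spec))
    # Keep the noisy phase; zero every cell the mask labels as noise.
    return istft(spec * mask, frame_len, hop), mask
```

In this sketch, replacing `placeholder_segment` with the output of a trained segmentation network would leave the rest of the pipeline unchanged, since only the mask decides which cells survive resynthesis.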

Code Implementations: 1 repo