Sinusoidal wave generating network based on adversarial learning and its application: synthesizing frog sounds for data augmentation
This work addresses data scarcity in signal processing, particularly for amphibian sound classification, by providing an efficient method for synthetic data generation. The contribution is incremental, however, because it applies existing adversarial learning techniques to a specific domain.
The paper tackles the problem of generating realistic synthetic signals for data augmentation by proposing a generative model based on adversarial learning, built around sinusoidal wave generating networks for audio waveforms. The results indicate that the model produces realistic signals and that, in amphibian sound classification, training with synthetic sounds improves performance, though no concrete numbers are provided.
Simulators that generate observations based on theoretical models can be important tools for the development, prediction, and assessment of signal processing algorithms. Designing these simulators requires painstaking effort to construct mathematical models suited to each application, and complex models are sometimes necessary to represent a variety of real phenomena. In contrast, obtaining synthetic observations from generative models developed from real observations often requires much less effort. This paper proposes a generative model based on adversarial learning. Given that observations are typically signals composed of a linear combination of sinusoidal waves and random noise, sinusoidal wave generating networks are first designed based on an adversarial network. Audio waveform generation can then be performed using the proposed network. Several approaches to designing the objective function of the proposed network using adversarial learning are investigated experimentally. In addition, amphibian sound classification is performed using a convolutional neural network trained with real and synthetic sounds. Both qualitative and quantitative results show that the proposed generative model produces realistic signals and is very helpful for data augmentation and data analysis.
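The signal model assumed in the abstract, a linear combination of sinusoidal waves plus random noise, can be illustrated with a minimal sketch. This is not the proposed network; it only shows the kind of observation the network is meant to reproduce. The function name `synthesize_signal`, the Gaussian noise assumption, and all parameter values (amplitudes, frequencies, phases, sample rate) are hypothetical choices made for illustration.

```python
import math
import random


def synthesize_signal(components, noise_std, sample_rate, n_samples, seed=0):
    """Return n_samples of sum_k a_k * sin(2*pi*f_k*t + phi_k) plus noise.

    components: list of (amplitude, frequency_hz, phase_rad) tuples.
    noise_std:  standard deviation of additive Gaussian noise (an assumption;
                the abstract only says "random noise").
    """
    rng = random.Random(seed)  # local generator so results are repeatable
    samples = []
    for n in range(n_samples):
        t = n / sample_rate  # time of the n-th sample in seconds
        value = sum(
            a * math.sin(2.0 * math.pi * f * t + phi)
            for a, f, phi in components
        )
        value += rng.gauss(0.0, noise_std)  # additive noise term
        samples.append(value)
    return samples


# Hypothetical example: two harmonically related tones with light noise.
components = [(1.0, 440.0, 0.0), (0.5, 880.0, math.pi / 4)]
signal = synthesize_signal(components, noise_std=0.05,
                           sample_rate=8000, n_samples=80)

# Noise-free check: a 1 Hz sine sampled at 4 Hz hits its peak at t = 0.25 s.
clean = synthesize_signal([(1.0, 1.0, 0.0)], noise_std=0.0,
                          sample_rate=4, n_samples=4)
```

A hand-built simulator of this kind requires choosing every amplitude, frequency, and phase explicitly, which is the modelling effort the abstract contrasts with learning a generator from real observations.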