SD LG NE AS, Apr 22, 2024

LVNS-RAVE: Diversified audio generation with RAVE and Latent Vector Novelty Search

Jinyue Guo, Anna-Maria Christodoulou, Balint Laczko, Kyrre Glette

arXiv:2404.14063v1, 4.92 citations, h-index: 25, Has Code, GECCO Companion

Originality: Incremental advance

AI Analysis

LVNS-RAVE provides a creative tool for sound artists and musicians, though the advance is incremental because it integrates existing methods.

The paper tackles the problem of generating realistic and novel audio by combining Evolutionary Algorithms and Generative Deep Learning. It uses RAVE as the sound generator and VGGish as the novelty evaluator within Latent Vector Novelty Search (LVNS), and reports diversified audio samples under various mutation setups.

Evolutionary Algorithms and Generative Deep Learning have been two of the most powerful tools for sound generation tasks. However, they have limitations: Evolutionary Algorithms require complicated designs, posing challenges in control and achieving realistic sound generation. Generative Deep Learning models often copy from the dataset and lack creativity. In this paper, we propose LVNS-RAVE, a method to combine Evolutionary Algorithms and Generative Deep Learning to produce realistic and novel sounds. We use the RAVE model as the sound generator and the VGGish model as a novelty evaluator in the Latent Vector Novelty Search (LVNS) algorithm. The reported experiments show that the method can successfully generate diversified, novel audio samples under different mutation setups using different pre-trained RAVE models. The characteristics of the generation process can be easily controlled with the mutation parameters. The proposed algorithm can be a creative tool for sound artists and musicians.
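
The loop described in the abstract (mutate latent vectors, decode them to audio, score novelty in an embedding space, keep the most novel) can be illustrated with a minimal sketch. This is not the authors' implementation: `decode` and `embed` are hypothetical stand-ins for a pre-trained RAVE decoder and the VGGish model, and the k-nearest-neighbour novelty score, the archive policy, and the `rate` and `sigma` mutation parameters are assumptions borrowed from generic novelty search.

```python
import numpy as np

def decode(z, n_samples=256):
    """Hypothetical stand-in for a RAVE decoder: latent vector -> waveform."""
    t = np.linspace(0.0, 1.0, n_samples)
    freqs = 1.0 + np.arange(z.size)
    # Each latent dimension weights one sinusoid.
    return np.sin(2 * np.pi * np.outer(t, freqs)) @ z

def embed(audio, n_bins=16):
    """Hypothetical stand-in for VGGish: waveform -> normalised feature vector."""
    spectrum = np.abs(np.fft.rfft(audio))[:n_bins]
    return spectrum / (np.linalg.norm(spectrum) + 1e-9)

def novelty(feature, others, k=3):
    """Mean Euclidean distance to the k nearest neighbours in `others`."""
    dists = np.linalg.norm(others - feature, axis=1)
    k = min(k, dists.size)
    return float(np.sort(dists)[:k].mean())

def mutate(z, rng, rate=0.5, sigma=0.3):
    """Add Gaussian noise of scale `sigma` to a random subset of dimensions."""
    mask = rng.random(z.shape) < rate
    return z + mask * rng.normal(0.0, sigma, z.shape)

def novelty_search(latent_dim=8, pop_size=12, generations=10,
                   rate=0.5, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    population = rng.normal(size=(pop_size, latent_dim))
    archive = []  # features of the most novel individual per generation
    for _ in range(generations):
        children = np.array([mutate(z, rng, rate, sigma) for z in population])
        candidates = np.vstack([population, children])
        feats = np.array([embed(decode(z)) for z in candidates])
        # Novelty is measured against current candidates plus the archive.
        reference = np.vstack([feats] + archive) if archive else feats
        scores = [
            novelty(f, np.delete(reference, i, axis=0))  # skip the candidate itself
            for i, f in enumerate(feats)
        ]
        order = np.argsort(scores)[::-1]  # most novel first
        population = candidates[order[:pop_size]]
        archive.append(feats[order[0]][None, :])
    return population, np.vstack(archive)

if __name__ == "__main__":
    population, archive = novelty_search()
    print(population.shape, archive.shape)
```

In this sketch, raising `sigma` or `rate` makes each generation move further through latent space, which is the kind of control over the generation process that the abstract attributes to the mutation parameters.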
