SD, AI, AS | Jun 18, 2024

JEN-1 DreamStyler: Customized Musical Concept Learning via Pivotal Parameters Tuning

arXiv:2406.12292v1 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for customized music generation among users who want to incorporate specific audio concepts. It represents an incremental advance in text-to-music models.

The paper tackles the problem of generating music that captures specific concepts from reference audio, which text prompts alone cannot convey precisely. It fine-tunes a pretrained text-to-music model with a Pivotal Parameters Tuning approach that avoids overfitting, and adds a concept enhancement strategy to handle multiple concepts. The resulting model outperforms several baselines in both qualitative and quantitative evaluations.

Large models for text-to-music generation have achieved significant progress, facilitating the creation of high-quality and varied musical compositions from provided text prompts. However, input text prompts may not precisely capture user requirements, particularly when the objective is to generate music that embodies a specific concept derived from a designated reference collection. In this paper, we propose a novel method for customized text-to-music generation, which can capture the concept from a two-minute piece of reference music and generate a new piece of music conforming to the concept. We achieve this by fine-tuning a pretrained text-to-music model using the reference music. However, directly fine-tuning all parameters leads to overfitting. To address this problem, we propose a Pivotal Parameters Tuning method that enables the model to assimilate the new concept while preserving its original generative capabilities. Additionally, we identify a potential concept conflict when introducing multiple concepts into the pretrained model. We present a concept enhancement strategy to distinguish multiple concepts, enabling the fine-tuned model to generate music incorporating either individual or multiple concepts simultaneously. Since we are the first to work on the customized music generation task, we also introduce a new dataset and evaluation protocol for the new task. Our proposed JEN-1 DreamStyler outperforms several baselines in both qualitative and quantitative evaluations. Demos will be available at https://www.jenmusic.ai/research#DreamStyler.
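
The abstract does not say how the pivotal parameters are chosen, so the following is only a rough sketch of the general idea of tuning a subset of parameters while freezing the rest. The selection rule used here (largest gradient magnitude), the function names `select_pivotal` and `masked_sgd_step`, and all numeric values are assumptions made for illustration, not the paper's method.

```python
# Illustrative sketch only. The selection criterion (largest |gradient|)
# and every name and number below are assumptions, not taken from the paper.

def select_pivotal(grads, fraction):
    """Return a 0/1 mask marking the `fraction` of parameters with the largest |gradient|."""
    k = max(1, int(len(grads) * fraction))
    ranked = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(grads))]

def masked_sgd_step(params, grads, mask, lr):
    """Apply one gradient step to masked parameters; unmasked ones keep their pretrained values."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]

# Hypothetical pretrained weights and gradients computed on the reference music.
pretrained = [0.5, -1.0, 2.0, 0.1]
grads = [0.01, -0.8, 0.05, 0.6]

mask = select_pivotal(grads, fraction=0.5)
tuned = masked_sgd_step(pretrained, grads, mask, lr=0.1)
```

Under these assumptions, only the two parameters with the largest gradients move. The other two stay at their pretrained values, which is the intuition behind learning a new concept without overwriting the model's original generative capabilities.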

