SD | AI | AS | Aug 28, 2023

InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models

arXiv:2308.14360v3 | 24 citations | h-index: 44
Originality: Incremental advance
AI Analysis

The paper addresses automated music editing for users without production expertise. It appears incremental, since it builds on existing diffusion models and adds domain-specific adaptations.

The authors tackle music editing and remixing with InstructME, a framework based on latent diffusion models. In their subjective and objective evaluations, it surpasses prior systems in music quality, text relevance, and harmony.

Music editing primarily entails modifying instrument tracks or remixing a piece as a whole, which offers a novel reinterpretation of the original piece through a series of operations. These music processing methods hold immense potential across various applications but demand substantial expertise. Prior methodologies, although effective for image and audio modifications, falter when directly applied to music. This is attributed to music's distinctive data nature, where such methods can inadvertently compromise the intrinsic harmony and coherence of music. In this paper, we develop InstructME, an Instruction guided Music Editing and remixing framework based on latent diffusion models. Our framework fortifies the U-Net with multi-scale aggregation in order to maintain consistency before and after editing. In addition, we introduce a chord progression matrix as condition information and incorporate it in the semantic space to improve melodic harmony while editing. To accommodate extended musical pieces, InstructME employs a chunk transformer, enabling it to discern long-term temporal dependencies within music sequences. We tested InstructME on instrument editing, remixing, and multi-round editing. Both subjective and objective evaluations indicate that our proposed method significantly surpasses preceding systems in music quality, text relevance, and harmony. Demo samples are available at https://musicedit.github.io/
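
The abstract mentions a chunk transformer for long pieces but does not say how sequences are divided. As a rough illustration only, the sketch below shows one generic way to split a long frame sequence into fixed-length chunks before a model processes them. The function name `split_into_chunks`, the `chunk_size` and `pad_value` parameters, and the padding strategy are all assumptions made for this example; they are not taken from the paper and may differ from what InstructME actually does.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def split_into_chunks(frames: Sequence[T], chunk_size: int, pad_value: T) -> List[List[T]]:
    """Split a frame sequence into equal-length chunks (hypothetical helper).

    The final chunk is right-padded with pad_value so every chunk has
    exactly chunk_size entries. This is a generic preprocessing step,
    not the paper's documented procedure.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: List[List[T]] = []
    for start in range(0, len(frames), chunk_size):
        chunk = list(frames[start:start + chunk_size])
        # Pad only when the last chunk is shorter than chunk_size.
        chunk.extend([pad_value] * (chunk_size - len(chunk)))
        chunks.append(chunk)
    return chunks


# Toy stand-in for a sequence of latent frames: ten integer "frames".
frames = list(range(10))
chunks = split_into_chunks(frames, chunk_size=4, pad_value=-1)
```

In a setup like this, a model could attend within each chunk and then across chunk summaries, which is one plausible way to capture long-range structure at a manageable cost.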

