SD · LG · AS · Oct 19, 2023

Audio Editing with Non-Rigid Text Prompts

arXiv:2310.12858v3 · 12 citations · h-index: 46
Originality: Incremental advance
AI Analysis

This work addresses audio editing for creative or professional users, but it appears incremental because it builds on existing text-prompted audio generation models.

The paper tackles audio editing using non-rigid text prompts for tasks like addition, style transfer, and in-painting, achieving results that outperform Audio-LDM in faithfulness to input audio, particularly in preserving onsets and offsets of audio events.

In this paper, we explore audio editing with non-rigid text edits. We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio. We explore text prompts that perform addition, style transfer, and in-painting. We show quantitatively and qualitatively that the edits outperform Audio-LDM, a recently released text-prompted audio generation model. Qualitative inspection of the results indicates that the edits produced by our approach remain more faithful to the input audio, in that they keep the original onsets and offsets of the audio events.

