AS · LG · SD · Sep 19, 2025

Similarity-Guided Diffusion for Long-Gap Music Inpainting

arXiv:2509.16342v1 · h-index: 9
Originality: Incremental advance
AI Analysis

This paper addresses music inpainting for corrupted recordings and offers an incremental improvement for long-gap scenarios.

The paper tackles the problem of reconstructing missing segments of music recordings over long gaps, where diffusion models struggle to stay musically plausible. The result is a hybrid method that, in a subjective evaluation on piano music with 2-second gaps, improves perceptual plausibility over unguided diffusion and frequently outperforms similarity search alone when moderately similar candidates are available.

Music inpainting aims to reconstruct missing segments of a corrupted recording. While diffusion-based generative models improve reconstruction for medium-length gaps, they often struggle to preserve musical plausibility over multi-second gaps. We introduce Similarity-Guided Diffusion Posterior Sampling (SimDPS), a hybrid method that combines diffusion-based inference with similarity search. Candidate segments are first retrieved from a corpus based on contextual similarity, then incorporated into a modified likelihood that guides the diffusion process toward contextually consistent reconstructions. Subjective evaluation on piano music inpainting with 2-s gaps shows that the proposed SimDPS method enhances perceptual plausibility compared to unguided diffusion and frequently outperforms similarity search alone when moderately similar candidates are available. These results demonstrate the potential of a hybrid similarity approach for diffusion-based audio enhancement with long gaps.
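
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which features or similarity measure SimDPS uses, so this example assumes cosine similarity on raw samples, and the function name `retrieve_candidates` and all of its parameters are hypothetical. It slides a window over a corpus, compares the audio on either side of each possible gap position with the context around the real gap, and returns the best-matching gap fillers.

```python
import numpy as np

def retrieve_candidates(corpus, left_ctx, right_ctx, gap_len, top_k=3, hop=1):
    """Return the top_k (score, segment) pairs from `corpus`.

    Each candidate segment is scored by how well the samples surrounding it
    match the context around the gap. Cosine similarity on raw samples is an
    assumption made for illustration only.
    """
    n_left, n_right = len(left_ctx), len(right_ctx)
    query = np.concatenate([left_ctx, right_ctx])
    query_norm = np.linalg.norm(query)
    window = n_left + gap_len + n_right

    scored = []
    for start in range(0, len(corpus) - window + 1, hop):
        gap_start = start + n_left
        gap_end = gap_start + gap_len
        # Context of this corpus position: samples before and after the gap slot.
        ctx = np.concatenate([corpus[start:gap_start],
                              corpus[gap_end:start + window]])
        denom = np.linalg.norm(ctx) * query_norm + 1e-12  # avoid division by zero
        scored.append((float(ctx @ query / denom), gap_start))

    # Highest similarity first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(score, corpus[g:g + gap_len].copy()) for score, g in scored[:top_k]]
```

In the method as described, such candidates are not pasted into the gap directly. They enter a modified likelihood that steers the diffusion sampler, which is why moderately similar candidates can still help.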

