SD · AI · LG · AS · SP · Jan 24, 2024

Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting

arXiv:2401.13498v1 · 8 citations · ICASSP
Originality: Incremental advance
AI Analysis

This work addresses the problem of realistic guitar sound synthesis for music production and AI applications, representing an incremental improvement with a domain-specific focus.

The paper synthesizes expressive acoustic guitar sound by introducing a guitar-specific input representation called guitarroll and using diffusion-based outpainting for long-term consistency. The authors report higher audio quality than a baseline model and more realistic timbre than the previous leading work.

Synthesizing performed guitar sound is a highly challenging task due to polyphony and high variability in expression. Recently, deep generative models have shown promising results in synthesizing expressive polyphonic instrument sounds from music scores, often using a generic MIDI input. In this work, we propose an expressive acoustic guitar sound synthesis model with an input representation customized to the instrument, which we call guitarroll. We implement the proposed approach using diffusion-based outpainting, which can generate audio with long-term consistency. To overcome the lack of MIDI/audio-paired datasets, we used not only an existing guitar dataset but also data collected from a high-quality sample-based guitar synthesizer. Through quantitative and qualitative evaluations, we show that our proposed model has higher audio quality than the baseline model and generates more realistic timbre than the previous leading work.
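The abstract does not say how outpainting is scheduled, so the following is only a generic sketch of the idea, not the authors' implementation. It assumes a long signal is produced window by window, with each new window conditioned on the tail of what has already been generated. The names `outpaint`, `toy_generate`, `window`, and `overlap` are hypothetical, and `toy_generate` is a stand-in for a diffusion sampler, not a real model.

```python
from typing import Callable, List, Sequence


def outpaint(
    total_len: int,
    window: int,
    overlap: int,
    generate: Callable[[Sequence[float], int], List[float]],
) -> List[float]:
    """Generate `total_len` samples window by window.

    `generate(context, n_new)` is a hypothetical stand-in for a diffusion
    sampler: it receives the last `overlap` samples produced so far (empty
    for the first window) and returns `n_new` samples that continue them.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    out: List[float] = []
    while len(out) < total_len:
        # The first window has no context; later windows reuse the tail.
        context = out[-overlap:] if (overlap and out) else []
        # Only the part of the window not covered by the context is new.
        n_new = min(window - len(context), total_len - len(out))
        new = generate(context, n_new)
        if len(new) != n_new:
            raise ValueError("generator returned the wrong number of samples")
        out.extend(new)
    return out


def toy_generate(context: Sequence[float], n_new: int) -> List[float]:
    """Toy generator (assumption): continues a simple counting sequence."""
    start = context[-1] + 1 if context else 0
    return [start + i for i in range(n_new)]


result = outpaint(total_len=10, window=4, overlap=2, generate=toy_generate)
```

With the toy generator, each window continues the count from the overlapping context, which is what makes the joined output consistent across window boundaries. A real sampler would instead hold the overlapping region fixed while denoising the remainder of the window.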

