SDAIASSep 10, 2024

Sines, Transient, Noise Neural Modeling of Piano Notes

arXiv:2409.06513v31 citationsh-index: 3
Originality Incremental advance
AI Analysis

This is an incremental improvement for audio synthesis and music technology applications.

The paper tackled piano sound emulation by proposing a differentiable spectral modeling synthesizer that decomposes sounds into sines, transient, and noise components, with results showing accurate energy distribution in spectra but limitations in modeling the attack phase of notes.

This paper introduces a novel method for emulating piano sounds. We propose to exploit the sines, transient, and noise decomposition to design a differentiable spectral modeling synthesizer replicating piano notes. Three sub-modules learn these components from piano recordings and generate the corresponding harmonic, transient, and noise signals. Splitting the emulation into three independently trainable models reduces the modeling tasks' complexity. The quasi-harmonic content is produced using a differentiable sinusoidal model guided by physics-derived formulas, whose parameters are automatically estimated from audio recordings. The noise sub-module uses a learnable time-varying filter, and the transients are generated using a deep convolutional network. From singular notes, we emulate the coupling between different keys in trichords with a convolutional-based network. Results show the model matches the partial distribution of the target while predicting the energy in the higher part of the spectrum presents more challenges. The energy distribution in the spectra of the transient and noise components is accurate overall. While the model is more computationally and memory efficient, perceptual tests reveal limitations in accurately modeling the attack phase of notes. Despite this, it generally achieves perceptual accuracy in emulating single notes and trichords.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes