SDAIAS May 22, 2024

End-to-End Real-World Polyphonic Piano Audio-to-Score Transcription with Hierarchical Decoding

arXiv:2405.13527v1 | 5 citations | h-index: 2 | IJCAI
Originality: Highly original
AI Analysis

The paper addresses an important but underexplored task with applications in music composition, practice, and analysis, though its contribution is incremental in that it builds on existing end-to-end systems.

The paper tackles the transcription of polyphonic piano audio into musical scores. Earlier end-to-end systems had difficulty recovering bar-level information and were limited to synthetic data. The authors propose a hierarchical decoder and a two-stage training scheme, and report state-of-the-art performance on synthetic data together with the first experiment on human recordings.

Piano audio-to-score transcription (A2S) is an important yet underexplored task with extensive applications for music composition, practice, and analysis. However, existing end-to-end piano A2S systems faced difficulties in retrieving bar-level information such as key and time signatures, and have been trained and evaluated with only synthetic data. To address these limitations, we propose a sequence-to-sequence (Seq2Seq) model with a hierarchical decoder that aligns with the hierarchical structure of musical scores, enabling the transcription of score information at both the bar and note levels by multi-task learning. To bridge the gap between synthetic data and recordings of human performance, we propose a two-stage training scheme, which involves pre-training the model using an expressive performance rendering (EPR) system on synthetic audio, followed by fine-tuning the model using recordings of human performance. To preserve the voicing structure for score reconstruction, we propose a pre-processing method for **Kern scores in scenarios with an unconstrained number of voices. Experimental results support the effectiveness of our proposed approaches, in terms of both transcription performance on synthetic audio data in comparison to the current state-of-the-art, and the first experiment on human recordings.
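
To make the bar-level and note-level split described in the abstract concrete, here is a minimal Python sketch of a hierarchical decoding loop. It is an illustration under stated assumptions, not the authors' implementation: the names `Bar`, `hierarchical_decode`, `bar_step`, `note_step`, and the `<eob>` token are hypothetical, and the two "decoders" are toy lookup functions standing in for trained networks. The note tokens are illustrative **Kern-like strings only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional, Tuple

END_OF_BAR = "<eob>"  # hypothetical token that closes a bar


@dataclass
class Bar:
    """One bar of the score: bar-level attributes plus note-level tokens."""
    key_signature: str
    time_signature: str
    notes: List[str] = field(default_factory=list)


# Hypothetical interfaces for the two decoding levels.
# bar_step returns (key, time signature) for the next bar, or None to stop.
BarStep = Callable[[Any, List[Bar]], Optional[Tuple[str, str]]]
# note_step returns the next note token for the bar currently being decoded.
NoteStep = Callable[[Any, List[Bar], Bar], str]


def hierarchical_decode(
    encoded_audio: Any,
    bar_step: BarStep,
    note_step: NoteStep,
    max_bars: int = 512,
    max_notes_per_bar: int = 256,
) -> List[Bar]:
    """Decode bar by bar; within each bar, decode note tokens."""
    bars: List[Bar] = []
    for _ in range(max_bars):
        # Outer level: predict bar-level information first.
        bar_info = bar_step(encoded_audio, bars)
        if bar_info is None:
            break
        bar = Bar(*bar_info)
        # Inner level: emit note tokens conditioned on the current bar.
        for _ in range(max_notes_per_bar):
            token = note_step(encoded_audio, bars, bar)
            if token == END_OF_BAR:
                break
            bar.notes.append(token)
        bars.append(bar)
    return bars


# Toy stand-ins for trained decoders: they read from a fixed two-bar score
# and ignore the audio entirely. Assumption for illustration only.
TOY_SCORE = [
    ("C major", "4/4", ["4c", "4e", "4g", "4cc"]),
    ("C major", "4/4", ["2d", "2f"]),
]


def toy_bar_step(encoded_audio: Any, bars: List[Bar]) -> Optional[Tuple[str, str]]:
    index = len(bars)
    if index >= len(TOY_SCORE):
        return None
    key, meter, _ = TOY_SCORE[index]
    return key, meter


def toy_note_step(encoded_audio: Any, bars: List[Bar], bar: Bar) -> str:
    notes = TOY_SCORE[len(bars)][2]  # the bar being decoded is not yet in `bars`
    position = len(bar.notes)
    return notes[position] if position < len(notes) else END_OF_BAR


if __name__ == "__main__":
    for number, decoded in enumerate(
        hierarchical_decode(None, toy_bar_step, toy_note_step), start=1
    ):
        print(number, decoded.key_signature, decoded.time_signature, decoded.notes)
```

In a trained system, both step functions would be learned and optimized jointly, which is where the multi-task learning mentioned in the abstract would come in. The nested loop only shows why bar-level attributes such as key and time signature can be predicted once per bar rather than once per note.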

Code implementations: 1 repo