SD · MM · AS · Jul 27, 2021

Audio-to-Score Alignment Using Deep Automatic Music Transcription

arXiv:2107.12854v3 · 16 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of precise music synchronization for applications like music education and analysis, though it appears incremental by building on existing transcription and alignment techniques.

The paper tackles note-level audio-to-score alignment by combining deep automatic music transcription with HMM-based score-to-score alignment. The authors report a marked improvement over the state of the art, supported by tests on multiple datasets.

Audio-to-score alignment (A2SA) is a multimodal task that consists of aligning audio signals to music scores. Recent literature confirms the benefits of Automatic Music Transcription (AMT) for A2SA at the frame level. In this work, we aim to elaborate on the exploitation of AMT Deep Learning (DL) models for achieving alignment at the note level. We propose a method which benefits from HMM-based score-to-score alignment and AMT, showing a remarkable advancement beyond the state of the art. We design a systematic procedure to take advantage of large datasets which do not offer an aligned score. Finally, we perform a thorough comparison and extensive tests on multiple datasets.
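
For orientation, frame-level alignment driven by a transcription can be sketched with plain dynamic time warping (DTW) between two piano rolls. This is not the paper's method, which relies on HMM-based score-to-score alignment. It is a minimal illustration that assumes the AMT model's output has already been converted to a binary piano roll. The function names, the toy rolls, and the Hamming frame distance are all assumptions made for the example.

```python
import numpy as np


def dtw_path(cost):
    """Return a minimum-cost monotonic path through a pairwise cost matrix."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)  # accumulated cost, padded by one
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    # Backtrack from the end to the start, following the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: acc[s])
        path.append((i - 1, j - 1))
    return path[::-1]


def frame_costs(score_roll, perf_roll):
    """Hamming distance between every score frame and every performance frame."""
    diff = np.abs(score_roll[:, :, None] - perf_roll[:, None, :])
    return diff.sum(axis=0)


def onset_map(path, score_onsets):
    """Map each score onset frame to the first performance frame aligned to it."""
    first = {}
    for s, p in path:
        first.setdefault(s, p)
    return {s: first[s] for s in score_onsets}


# Toy piano rolls (pitches x frames): three notes, played with uneven timing.
score_roll = np.zeros((3, 6))
score_roll[0, 0:2] = 1
score_roll[1, 2:4] = 1
score_roll[2, 4:6] = 1

perf_roll = np.zeros((3, 9))  # hypothetical transcription of the recording
perf_roll[0, 0:3] = 1
perf_roll[1, 3:7] = 1
perf_roll[2, 7:9] = 1

cost = frame_costs(score_roll, perf_roll)
path = dtw_path(cost)
onsets = onset_map(path, score_onsets=[0, 2, 4])
print(onsets)
```

The onset map gives, for each note onset in the score, the performance frame where that note begins. Moving from this frame-level output to reliable note-level timing is the gap the paper's method addresses.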

Code Implementations: 1 repo