SD · AS · Jul 27, 2021

PKSpell: Data-Driven Pitch Spelling and Key Signature Estimation

Francesco Foscarin, Nicolas Audebert, Raphaël Fournier-S'Niehotta

arXiv:2107.14009v1 · 10.814 citations · h-index: 16 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a fundamental problem in music information retrieval that underpins tasks such as harmonic analysis and digital library search. The advance is incremental because it builds on existing data-driven methods.

The authors tackle the joint estimation of pitch spelling and key signatures from MIDI files. They report strong key signature estimation performance on a challenging dataset and a new state of the art on the MuseData pitch spelling dataset without retraining.

We present PKSpell: a data-driven approach for the joint estimation of pitch spelling and key signatures from MIDI files. Both elements are fundamental for the production of a full-fledged musical score and facilitate many MIR tasks such as harmonic analysis, section identification, melodic similarity, and search in a digital music library. We design a deep recurrent neural network model that only requires information readily available in all kinds of MIDI files, including performances, or other symbolic encodings. We release a model trained on the ASAP dataset. Our system can be used with these pre-trained parameters and is easy to integrate into a MIR pipeline. We also propose a data augmentation procedure that helps retraining on small datasets. PKSpell achieves strong key signature estimation performance on a challenging dataset. Most importantly, this model establishes a new state-of-the-art performance on the MuseData pitch spelling dataset without retraining.
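To make the pitch spelling problem concrete, the sketch below lists the note names that one MIDI pitch number can stand for. It is an illustration of the ambiguity only, not PKSpell's model or code. The function name `candidate_spellings` and the choice to allow up to double accidentals are assumptions made for this example, and octave numbers are left out for brevity.

```python
# Illustrative sketch: enumerate the enharmonic spellings of a MIDI pitch.
# This is NOT the PKSpell model; it only shows the ambiguity that a pitch
# spelling system has to resolve. Names here are hypothetical.

# Pitch class (0-11) of each natural note name.
NATURAL_PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Semitone offset of each accidental considered (assumption: up to double).
ACCIDENTALS = {-2: "bb", -1: "b", 0: "", 1: "#", 2: "##"}


def candidate_spellings(midi_pitch):
    """Return all spellings (without octave) that match a MIDI pitch number."""
    pitch_class = midi_pitch % 12
    spellings = []
    for letter, natural in NATURAL_PITCH_CLASS.items():
        for offset, symbol in ACCIDENTALS.items():
            # A spelling matches when natural note plus accidental lands
            # on the same pitch class as the MIDI number.
            if (natural + offset) % 12 == pitch_class:
                spellings.append(letter + symbol)
    return spellings


if __name__ == "__main__":
    for pitch in (60, 61, 68):
        print(pitch, candidate_spellings(pitch))
```

A MIDI file stores only the number, so 61 could be written as C#, Db, or B##. Choosing among such candidates for every note, together with the key signature, is the task the abstract describes.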
