AS, LG, SD | Aug 27, 2024

Development of Large Annotated Music Datasets using HMM-based Forced Viterbi Alignment

arXiv:2408.14890v1 | h-index: 16
Originality: Synthesis-oriented
AI Analysis

This work provides a streamlined approach for generating datasets for monophonic instruments, which should aid researchers in Automatic Music Transcription (AMT). It is incremental, however, since it builds on existing alignment techniques.

The authors address the challenge of creating annotated music datasets for Automatic Music Transcription by applying HMM-based forced Viterbi alignment to recordings of predefined guitar exercises, yielding transcriptions whose onset labels are accurate up to 10 ms, averaging 5 ms.

Datasets are essential for any machine learning task. Automatic Music Transcription (AMT) is one such task, where a considerable amount of data is required, depending on how the solution is approached. Because a music dataset complete with audio and time-aligned transcriptions requires the effort of people with musical experience, building one is even more challenging. Musical experience is required to play the instrument(s) and to annotate and verify the transcriptions. We propose a method that streamlines this process, making it easy and efficient to obtain a dataset for a particular instrument. We use predefined guitar exercises and hidden Markov model (HMM) based forced Viterbi alignment to accomplish this. The guitar exercises are designed to be simple. Since the note sequences are already defined, HMM-based forced Viterbi alignment provides time-aligned transcriptions of these audio files. The onsets of the transcriptions are manually verified, and the labels are accurate up to 10 ms, averaging 5 ms. The contributions of the proposed work are twofold: i) a streamlined and efficient method for generating datasets for any instrument, especially monophonic ones, and ii) an acoustic plectrum guitar dataset containing wave files and transcriptions in the form of label files. This method serves as a preliminary step towards building concrete datasets for AMT systems for different instruments.
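
To illustrate the idea of forced alignment described in the abstract, here is a minimal sketch, not the authors' implementation. It assumes a left-to-right HMM with exactly one state per note of the known exercise, no transition penalties, and a precomputed matrix of per-frame log-likelihoods from some acoustic model. The function name `forced_align`, the toy scores, and the 10 ms hop size are all hypothetical.

```python
import numpy as np

def forced_align(log_lik):
    """Forced Viterbi alignment for a known note sequence.

    log_lik[t, j] is the (assumed precomputed) log-likelihood of audio
    frame t under the j-th note of the predefined sequence. The path must
    start in note 0, end in the last note, and may only stay or advance
    by one note per frame. Returns the onset frame of every note.
    """
    n_frames, n_notes = log_lik.shape
    if n_frames < n_notes:
        raise ValueError("need at least one frame per note")

    score = np.full((n_frames, n_notes), -np.inf)
    advanced = np.zeros((n_frames, n_notes), dtype=bool)
    score[0, 0] = log_lik[0, 0]  # the path is forced to start in note 0

    for t in range(1, n_frames):
        for j in range(n_notes):
            stay = score[t - 1, j]
            advance = score[t - 1, j - 1] if j > 0 else -np.inf
            if advance > stay:
                score[t, j] = advance + log_lik[t, j]
                advanced[t, j] = True
            else:
                score[t, j] = stay + log_lik[t, j]

    # Backtrack from the last note at the last frame.
    onsets = np.zeros(n_notes, dtype=int)
    j = n_notes - 1
    for t in range(n_frames - 1, 0, -1):
        if advanced[t, j]:
            onsets[j] = t  # note j begins at frame t
            j -= 1
    return onsets

# Toy example: 3 notes over 10 frames, with made-up scores that favour
# note 0 on frames 0-2, note 1 on frames 3-6, and note 2 on frames 7-9.
toy = np.full((10, 3), -5.0)
toy[0:3, 0] = 0.0
toy[3:7, 1] = 0.0
toy[7:10, 2] = 0.0

hop_seconds = 0.010  # hypothetical frame hop
onset_frames = forced_align(toy)
onset_times = onset_frames * hop_seconds
```

Because the note order is fixed in advance, the search only decides where each note boundary falls, which is why predefined exercises make the labelling tractable. A real system would add transition probabilities and typically several states per note.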

