Towards Score Following in Sheet Music Images
This work addresses the challenge of real-time music synchronization for musicians or automated systems, though the contribution is incremental, as it builds on existing neural network techniques.
The paper tackles the problem of matching short music audio snippets to their corresponding positions in sheet music images using an end-to-end multi-modal convolutional neural network, and its results suggest that score following with deep learning methods is feasible.
This paper addresses the matching of short music audio snippets to the corresponding pixel location in images of sheet music. A system is presented that simultaneously learns to read notes, listen to music and match the currently played music to its corresponding notes in the sheet. It consists of an end-to-end multi-modal convolutional neural network that takes as input images of sheet music and spectrograms of the respective audio snippets. It learns to predict, for a given unseen audio snippet (covering approximately one bar of music), the corresponding position in the respective score line. Our results suggest that with the use of (deep) neural networks -- which have proven to be powerful image processing models -- working with sheet music becomes feasible and a promising future research direction.
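
The abstract does not say how the predicted position is represented at the network output. As a minimal sketch, assume the score line is split into equal-width horizontal buckets and the network emits one score per bucket; this bucketing, the function names and the numbers below are all hypothetical and chosen only for illustration. Under that assumption, the scores can be turned into a pixel position as follows:

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bucket_centers(line_width_px, n_buckets):
    """Pixel x-coordinate of the center of each equal-width bucket (assumed layout)."""
    width = line_width_px / n_buckets
    return [(i + 0.5) * width for i in range(n_buckets)]


def predict_position(logits, line_width_px):
    """Map per-bucket scores to (most likely x, probability-weighted x) in pixels."""
    probs = softmax(logits)
    centers = bucket_centers(line_width_px, len(logits))
    best = max(range(len(probs)), key=probs.__getitem__)
    expected = sum(p * c for p, c in zip(probs, centers))
    return centers[best], expected


# Hypothetical network output for one audio snippet:
# 4 buckets spread over a 400 px wide score line.
logits = [0.0, 0.0, 10.0, 0.0]
argmax_x, expected_x = predict_position(logits, line_width_px=400)
print(argmax_x, expected_x)
```

The most likely bucket gives a coarse position, while the probability-weighted average gives a smoother estimate when the scores are spread over neighbouring buckets. Which of the two (if either) matches the system described here is not stated in the abstract.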