Statistical Learning and Estimation of Piano Fingering
This work addresses piano fingering estimation, with applications in music performance understanding, assistance, and education, and improves on existing methods in estimation accuracy.
The paper develops data-driven statistical models for automatically estimating piano fingering, specifically hidden Markov models (HMMs) and their higher-order extensions, and finds that high-order HMMs outperform the other methods evaluated, including deep neural networks and a constraint-based approach, in estimation accuracy.
Automatic estimation of piano fingering is important for understanding the computational process of music performance and is applicable to performance assistance and education systems. While a natural way to formulate the quality of fingerings is to construct models of the constraints or costs of performance, it is generally difficult to find appropriate parameter values for these models. Here we study an alternative data-driven approach based on statistical modeling, in which the appropriateness of a given fingering is described by probabilities. Specifically, we construct two types of hidden Markov models (HMMs) and their higher-order extensions. We also study deep neural network (DNN)-based methods for comparison. Using a newly released dataset of fingering annotations, we conduct systematic evaluations of these models as well as a representative constraint-based method. We find that the methods based on high-order HMMs outperform the other methods in terms of estimation accuracy. We also quantitatively study individual differences in fingering and propose evaluation measures that can be used with multiple ground truths. We conclude that the HMM-based methods are currently state of the art and generate acceptable fingerings in most parts, but that they have certain limitations, such as ignoring phrase boundaries and the interdependence of the two hands.
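To make the probabilistic formulation concrete, the following is a minimal sketch of Viterbi decoding for a first-order HMM in which the hidden states are the fingers of one hand and the observations are pitch intervals between consecutive notes. It is not the paper's implementation: the function names, the uniform initial and transition probabilities, and the Gaussian-shaped output score (2 semitones per finger step) are all toy assumptions chosen for illustration, whereas the models studied here learn such parameters from annotated data.

```python
import math

FINGERS = [1, 2, 3, 4, 5]  # one hand, thumb = 1 (assumed convention)


def log_transition(prev_finger, finger):
    # Toy assumption: every finger-to-finger transition is equally likely.
    return math.log(1.0 / len(FINGERS))


def log_output(interval, prev_finger, finger):
    # Toy assumption: moving up one finger is expected to span about
    # 2 semitones, with an unnormalized Gaussian-shaped penalty otherwise.
    expected = 2.0 * (finger - prev_finger)
    return -0.5 * ((interval - expected) / 2.0) ** 2


def viterbi_fingering(pitches):
    """Return the most probable finger sequence for a list of MIDI pitches."""
    if not pitches:
        return []
    # Log-probability of the best path ending in each finger (uniform start).
    score = {f: math.log(1.0 / len(FINGERS)) for f in FINGERS}
    backpointers = []
    for n in range(1, len(pitches)):
        interval = pitches[n] - pitches[n - 1]
        new_score, pointers = {}, {}
        for f in FINGERS:
            def candidate(p):
                return score[p] + log_transition(p, f) + log_output(interval, p, f)
            best_prev = max(FINGERS, key=candidate)
            new_score[f] = candidate(best_prev)
            pointers[f] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final finger.
    path = [max(FINGERS, key=lambda f: score[f])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

Under these toy parameters, an ascending whole-tone run such as `viterbi_fingering([60, 62, 64, 66, 68])` decodes to fingers 1 through 5 in order. A higher-order extension would condition each step on more than one preceding finger, which enlarges the state space but leaves the dynamic-programming structure unchanged.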