FretNet: Continuous-Valued Pitch Contour Streaming for Polyphonic Guitar Tablature Transcription
This work addresses a specific challenge in music transcription for guitarists: representing playing techniques that involve pitch modulation more accurately. The contribution is incremental, since it builds on existing AMT methods.
The paper tackles Automatic Music Transcription for guitar tablature by estimating continuous-valued pitch contours, which better represent pitch modulation techniques. The reported result is a significant improvement in Multi-Pitch Estimation resolution, with tablature estimation that remains competitive with baseline models.
In recent years, the task of Automatic Music Transcription (AMT), whereby various attributes of music notes are estimated from audio, has received increasing attention. At the same time, the related task of Multi-Pitch Estimation (MPE) remains a challenging but necessary component of almost all AMT approaches, even if only implicitly. In the context of AMT, pitch information is typically quantized to the nominal pitches of the Western music scale. Even in more general contexts, MPE systems typically produce pitch predictions with some degree of quantization. In certain applications of AMT, such as Guitar Tablature Transcription (GTT), it is more meaningful to estimate continuous-valued pitch contours. Guitar tablature has the capacity to represent various playing techniques, some of which involve pitch modulation. Contemporary approaches to AMT do not adequately address pitch modulation, and reduce quantization only at the cost of greater model complexity. In this paper, we present a GTT formulation that estimates continuous-valued pitch contours, grouping them according to their string and fret of origin. We demonstrate that for this task, the proposed method significantly improves the resolution of MPE and simultaneously yields tablature estimation results competitive with baseline models.
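To make the distinction between quantized pitch and a continuous-valued contour concrete, the following is a minimal illustrative sketch, not the method proposed in the paper. It assumes standard guitar tuning and the usual MIDI convention (A4 = 440 Hz = note 69); the function names and the example bend are hypothetical. Given a contour in Hz and its string and fret of origin, it reports the deviation from the nominal fretted pitch, which is the information lost when pitch is rounded to the nearest semitone.

```python
import math

# Open-string MIDI pitches for standard tuning (E2, A2, D3, G3, B3, E4).
# Assumption for illustration; other tunings would change these values.
STANDARD_TUNING = [40, 45, 50, 55, 59, 64]


def hz_to_midi(freq_hz: float) -> float:
    """Convert a frequency in Hz to a continuous-valued MIDI pitch."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def quantize_contour(contour_hz):
    """Round each frame of a contour to the nearest nominal (semitone) pitch."""
    return [round(hz_to_midi(f)) for f in contour_hz]


def deviation_from_origin(contour_hz, string_index, fret, tuning=STANDARD_TUNING):
    """Return the per-frame deviation, in semitones, from the nominal pitch
    of the given string and fret (the contour's string and fret of origin)."""
    nominal = tuning[string_index] + fret
    return [hz_to_midi(f) - nominal for f in contour_hz]


# Hypothetical whole-tone bend starting at the 7th fret of the G string (D4).
bend_hz = [293.66, 302.27, 311.13, 320.24, 329.63]

quantized = quantize_contour(bend_hz)
deviation = deviation_from_origin(bend_hz, string_index=3, fret=7)
```

In this sketch the quantized view reports the bend as a sequence of distinct semitones, whereas the deviation relative to the string and fret of origin rises smoothly from roughly 0 to roughly 2 semitones while staying attached to a single tablature position.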