IVNov 19, 2024
Automatic staff reconstruction within SIMSSA proectLorenzo J. Tardon, Isabel Barbancho, Ana M. Barbancho et al.
The automatic analysis of scores has been a research topic of interest for the last few decades and still is since music databases that include musical scores are currently being created to make musical content available to the public, including scores of ancient music. For the correct analysis of music elements and their interpretation, the identification of staff lines is of key importance. In this paper, a scheme to post-process the output of a previous musical object identification system is described. This system allows the reconstruction by means of detection, tracking and interpolation of the staff lines of ancient scores from the digital Salzinnes Database. The scheme developed shows a remarkable performance on the specific task it was created for.
SDJul 6, 2025
High-Resolution Sustain Pedal Depth Estimation from Piano Audio Across Room AcousticsKun Fang, Hanwen Zhang, Ziyu Wang et al.
Piano sustain pedal detection has previously been approached as a binary on/off classification task, limiting its application in real-world piano performance scenarios where pedal depth significantly influences musical expression. This paper presents a novel approach for high-resolution estimation that predicts continuous pedal depth values. We introduce a Transformer-based architecture that not only matches state-of-the-art performance on the traditional binary classification task but also achieves high accuracy in continuous pedal depth estimation. Furthermore, by estimating continuous values, our model provides musically meaningful predictions for sustain pedal usage, whereas baseline models struggle to capture such nuanced expressions with their binary detection approach. Additionally, this paper investigates the influence of room acoustics on sustain pedal estimation using a synthetic dataset that includes varied acoustic conditions. We train our model with different combinations of room settings and test it in an unseen new environment using a "leave-one-out" approach. Our findings show that the two baseline models and ours are not robust to unseen room conditions. Statistical analysis further confirms that reverberation influences model predictions and introduces an overestimation bias.
SDMar 28, 2025
Teaching LLMs Music Theory with In-Context Learning and Chain-of-Thought Prompting: Pedagogical Strategies for MachinesLiam Pond, Ichiro Fujinaga
This study evaluates the baseline capabilities of Large Language Models (LLMs) like ChatGPT, Claude, and Gemini to learn concepts in music theory through in-context learning and chain-of-thought prompting. Using carefully designed prompts (in-context learning) and step-by-step worked examples (chain-of-thought prompting), we explore how LLMs can be taught increasingly complex material and how pedagogical strategies for human learners translate to educating machines. Performance is evaluated using questions from an official Canadian Royal Conservatory of Music (RCM) Level 6 examination, which covers a comprehensive range of topics, including interval and chord identification, key detection, cadence classification, and metrical analysis. Additionally, we evaluate the suitability of various music encoding formats for these tasks (ABC, Humdrum, MEI, MusicXML). All experiments were run both with and without contextual prompts. Results indicate that without context, ChatGPT with MEI performs the best at 52%, while with context, Claude with MEI performs the best at 75%. Future work will further refine prompts and expand to cover more advanced music theory concepts. This research contributes to the broader understanding of teaching LLMs and has applications for educators, students, and developers of AI music tools alike.