Recognizing Ornaments in Vocal Indian Art Music with Active Annotation
The work addresses a key bottleneck in music information retrieval, the lack of annotated data for ornament recognition, with applications such as pedagogy and genre classification. The contribution is incremental, however, since it builds on existing methods with new data and annotations.
The paper tackles the problem of recognizing vocal ornaments in Indian classical music by introducing the Rāga Ornamentation Detection (ROD) dataset and a deep time-series model, and reports better performance than a baseline CRNN.
Ornamentations, embellishments, or microtonal inflections are essential to melodic expression across many musical traditions, adding depth, nuance, and emotional impact to performances. Recognizing ornamentations in singing voices is key to music information retrieval (MIR), with potential applications in music pedagogy, singer identification, genre classification, and controlled singing voice generation. However, the lack of annotated datasets and specialized modeling approaches remains a major obstacle to progress in this research area. In this work, we introduce Rāga Ornamentation Detection (ROD), a novel dataset comprising Indian classical music recordings curated by expert musicians. The dataset is annotated using a custom Human-in-the-Loop tool for six vocal ornaments marked as event-based labels. Using this dataset, we develop an ornamentation detection model based on deep time-series analysis, preserving ornament boundaries during the chunking of long audio recordings. We conduct experiments using different train-test configurations within the ROD dataset and also evaluate our approach on a separate, manually annotated dataset of Indian classical concert recordings. Our experimental results support the superior performance of our proposed approach over the baseline CRNN.
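The abstract does not say how ornament boundaries are preserved during chunking, so the following is only a minimal sketch of one plausible approach, not the authors' method. It assumes event labels are non-overlapping (onset, offset, label) tuples in seconds; the function name `boundary_aware_chunks`, the `max_len` parameter, and the placeholder labels are hypothetical. A tentative cut that would fall inside an ornament is pulled back to that ornament's onset, or pushed to its offset when the ornament is longer than a chunk.

```python
from typing import List, Tuple

# Hypothetical event format: (onset_sec, offset_sec, label).
Event = Tuple[float, float, str]


def boundary_aware_chunks(
    duration: float, events: List[Event], max_len: float
) -> List[Tuple[float, float]]:
    """Split [0, duration] into consecutive chunks without cutting any event.

    Assumes events do not overlap and lie within [0, duration].
    Chunks are at most max_len long unless a single event exceeds max_len.
    """
    chunks = []
    start = 0.0
    while start < duration:
        cut = min(start + max_len, duration)
        # Events that the tentative cut would split in two.
        straddling = [e for e in events if e[0] < cut < e[1]]
        if straddling:
            earliest_onset = min(e[0] for e in straddling)
            if earliest_onset > start:
                # Pull the cut back so the ornament starts the next chunk.
                cut = earliest_onset
            else:
                # The ornament is longer than max_len: extend past its offset.
                cut = max(e[1] for e in straddling)
        chunks.append((start, cut))
        start = cut
    return chunks


# Placeholder labels; the source does not name the six ornament classes.
example_events = [(8.0, 12.0, "ornament_a"), (17.0, 19.0, "ornament_b")]
example_chunks = boundary_aware_chunks(30.0, example_events, max_len=10.0)
print(example_chunks)
```

With this rule, chunk lengths vary, so a downstream model would need padding or masking to batch them; overlapping windows would be an alternative design that keeps lengths fixed at the cost of duplicated frames.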