SDAIApr 6

YMIR: A new Benchmark Dataset and Model for Arabic Yemeni Music Genre Classification Using Convolutional Neural Networks

arXiv:2604.050113.7h-index: 6
Predicted impact top 95% in SD · last 90 daysOriginality Synthesis-oriented
AI Analysis

This work addresses the problem of limited benchmarks for non-Western music genres, providing a dataset and model for Yemeni music classification, though it is incremental as it applies existing CNN methods to a new domain.

The paper tackled the underrepresentation of culturally specific music in automatic genre classification by introducing the YMIR dataset of 1,475 Yemeni audio clips and the YMCM model, which achieved 98.8% accuracy using Mel-spectrogram features.

Automatic music genre classification is a major task in music information retrieval; however, most current benchmarks and models have been developed primarily for Western music, leaving culturally specific traditions underrepresented. In this paper, we introduce the Yemeni Music Information Retrieval (YMIR) dataset, which contains 1,475 carefully selected audio clips covering five traditional Yemeni genres: Sanaani, Hadhrami, Lahji, Tihami, and Adeni. The dataset was labeled by five Yemeni music experts following a clear and structured protocol, resulting in strong inter-annotator agreement (Fleiss kappa = 0.85). We also propose the Yemeni Music Classification Model (YMCM), a convolutional neural network (CNN)-based system designed to classify music genres from time-frequency features. Using a consistent preprocessing pipeline, we perform a systematic comparison across six experimental groups and five different architectures, resulting in a total of 30 experiments. Specifically, we evaluate several feature representations, including Mel-spectrograms, Chroma, FilterBank, and MFCCs with 13, 20, and 40 coefficients, and benchmark YMCM against standard models (AlexNet, VGG16, MobileNet, and a baseline CNN) under the same experimental conditions. The experimental findings reveal that YMCM is the most effective, achieving the highest accuracy of 98.8% with Mel-spectrogram features. The results also provide practical insights into the relationship between feature representation and model capacity. The findings establish YMIR as a useful benchmark and YMCM as a strong baseline for classifying Yemeni music genres.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes