LG · Sep 14, 2025

Decoding Musical Origins: Distinguishing Human and AI Composers

arXiv:2509.11369v1
Originality: Incremental advance
AI Analysis

The work provides a tool for tracing the origins of AI-generated music, addressing a domain-specific challenge in music analysis and AI content verification.

The study tackles the problem of determining whether a piece of music was composed by a human, a rule-based algorithm, or an LLM. It represents music in YNote, a novel notation system, and trains a classification model using TF-IDF features and SMOTE oversampling, achieving an accuracy of 98.25%.

With the rapid advancement of Large Language Models (LLMs), AI-driven music generation has become a vibrant and fruitful area of research. However, the representation of musical data remains a significant challenge. To address this, a novel, machine-learning-friendly music notation system, YNote, was developed. This study leverages YNote to train an effective classification model capable of distinguishing whether a piece of music was composed by a human (Native), a rule-based algorithm (Algorithm Generated), or an LLM (LLM Generated). We frame this as a text classification problem, applying the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm to extract structural features from YNote sequences and using the Synthetic Minority Over-sampling Technique (SMOTE) to address data imbalance. The resulting model achieves an accuracy of 98.25%, successfully demonstrating that YNote retains sufficient stylistic information for analysis. More importantly, the model can identify the unique "technological fingerprints" left by different AI generation techniques, providing a powerful tool for tracing the origins of AI-generated content.
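The two preprocessing steps named in the abstract, TF-IDF feature extraction and SMOTE oversampling, can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption made for illustration: the token strings such as `C4q` are hypothetical stand-ins and not real YNote syntax, the use of bigrams is a guess at how "structural features" might be captured, and the helper names (`ngrams`, `tfidf_fit_transform`, `smote`) are invented. The paper's actual tokenization, hyperparameters, and classifier are not given in this summary.

```python
import math
import random
from collections import Counter

def ngrams(tokens, n=2):
    """Join consecutive tokens into n-grams so local note order is kept."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tfidf_fit_transform(docs):
    """docs: list of term lists. Returns (vocab, rows) with L2-normalized rows.

    Uses the smoothed IDF ln((1 + N) / (1 + df)) + 1.
    """
    vocab = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for d in docs:
        tf = Counter(d)
        row = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        rows.append([v / norm for v in row])
    return vocab, rows

def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic rows by interpolating between a minority
    sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [m for m in minority if m is not x]
        neighbours = sorted(others, key=lambda m: math.dist(x, m))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic

# Hypothetical toy corpus: 4 "Native" pieces, 2 "LLM Generated" pieces.
pieces = [
    ("Native", "C4q E4q G4h E4q"),
    ("Native", "D4q F4q A4h F4q"),
    ("Native", "E4q G4q B4h G4q"),
    ("Native", "C4q D4q E4h G4q"),
    ("LLM Generated", "C4q C4q C4q C4q"),
    ("LLM Generated", "G4q G4q G4q C4q"),
]
labels = [label for label, _ in pieces]
docs = [ngrams(seq.split(), 2) for _, seq in pieces]

vocab, X = tfidf_fit_transform(docs)
minority = [x for x, label in zip(X, labels) if label == "LLM Generated"]
new_rows = smote(minority, n_new=2, k=1)

# Balanced training set: 4 Native rows, 2 real + 2 synthetic LLM rows.
X_bal = X + new_rows
y_bal = labels + ["LLM Generated"] * len(new_rows)
```

The balanced `X_bal` and `y_bal` would then be passed to whichever classifier is chosen. In practice, library implementations such as scikit-learn's `TfidfVectorizer` and imbalanced-learn's `SMOTE` would likely replace these hand-written helpers; oversampling should be applied to the training split only, so that synthetic rows never leak into evaluation data.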
