SD · May 13

Text2Score: Generating Sheet Music From Textual Prompts

arXiv:2605.13431 · 77.7 · Has Code
Predicted impact: top 18% in SD · last 90 days · Originality: Incremental advance
AI Analysis

For researchers in symbolic music generation, this work addresses the data scarcity bottleneck by proposing a training paradigm that avoids noisy text-music pairs, enabling text-driven sheet music generation.

Text2Score generates sheet music from natural language prompts using a two-stage framework (planning + execution) that bypasses scarce text-music pairs by deriving supervision from symbolic XML data. It outperforms pure LLM-based and end-to-end baselines on playability, readability, and prompt adherence.

Developing text-driven symbolic music generation models remains challenging due to the scarcity of aligned text-music datasets and the unreliability of automated captioning pipelines. While most efforts have focused on MIDI, sheet music representations are largely underexplored in text-driven generation. We present Text2Score, a two-stage framework comprising a planning stage and an execution stage for generating sheet music from natural language prompts. By deriving supervision signals directly from symbolic XML data, we propose an alternative training paradigm that bypasses noisy or scarce text-music pairs. In the planning stage, an LLM orchestrator translates a natural language prompt into a structured measure-wise plan defining musical attributes such as instruments, key, time signatures, and harmony. This plan is then consumed by a generative model in the execution stage to produce interleaved ABC notation conditioned on the plan's structural constraints. To assess output quality, we introduce an evaluation framework covering playability, readability, instrument utilization, structural complexity, and prompt adherence, validated by expert musicians. Text2Score consistently outperforms both a pure LLM-based agentic framework and three end-to-end baselines across objective and subjective dimensions. We open-source the dataset, code, evaluation set, and LLM prompts used in this work; a demo is available on our project page (https://keshavbhandari.github.io/portfolio/text2score).
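
To make the planning-to-execution handoff more concrete, here is a minimal Python sketch of what a measure-wise plan could look like and how its global attributes might be turned into an ABC tune header. The abstract does not specify the plan schema, so the `MeasurePlan` fields and the `validate_plan` and `abc_header` helpers are hypothetical names chosen for illustration, not the authors' format or code. Only the ABC header fields themselves (`X:`, `T:`, `M:`, `L:`, `V:`, `K:`) come from the ABC notation standard.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MeasurePlan:
    """One measure of a hypothetical plan (field names are assumptions)."""
    index: int                 # 1-based measure number
    key: str                   # e.g. "G" or "Am"
    time_signature: str        # e.g. "3/4"
    harmony: str               # chord symbol for the measure, e.g. "D7"
    instruments: List[str] = field(default_factory=list)


def validate_plan(plan: List[MeasurePlan]) -> List[str]:
    """Return human-readable problems; an empty list means the plan passed."""
    errors = []
    for expected, measure in enumerate(plan, start=1):
        if measure.index != expected:
            errors.append(f"expected measure {expected}, got {measure.index}")
        numerator, _, denominator = measure.time_signature.partition("/")
        if not (numerator.isdigit() and denominator.isdigit()):
            errors.append(f"measure {expected}: bad time signature "
                          f"{measure.time_signature!r}")
        if not measure.instruments:
            errors.append(f"measure {expected}: no instruments listed")
    return errors


def abc_header(plan: List[MeasurePlan], title: str = "Untitled") -> str:
    """Build an ABC header from the first measure's global attributes.

    In ABC, X: opens the header and K: closes it, so voice (V:)
    definitions are placed before the key field.
    """
    first = plan[0]
    lines = ["X:1", f"T:{title}", f"M:{first.time_signature}", "L:1/8"]
    for voice, name in enumerate(first.instruments, start=1):
        lines.append(f'V:{voice} name="{name}"')
    lines.append(f"K:{first.key}")
    return "\n".join(lines)


# Illustrative two-measure plan for a string duet in G major.
example_plan = [
    MeasurePlan(1, "G", "3/4", "G", ["Violin", "Cello"]),
    MeasurePlan(2, "G", "3/4", "D7", ["Violin", "Cello"]),
]
```

In a pipeline of this shape, a plan that passes `validate_plan(example_plan)` would be handed to the execution-stage model, and `abc_header(example_plan, "Waltz")` would supply the fixed header that the generated measures are appended to. Checking the plan before generation is one plausible way to enforce structural constraints early; whether Text2Score does this is not stated in the abstract.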
