SD · LG · AS · Sep 20, 2025

Barwise Section Boundary Detection in Symbolic Music Using Convolutional Neural Networks

arXiv:2509.16566v1 · ISMIR
Originality: Incremental advance
AI Analysis

The paper addresses section boundary detection, a key subproblem of Music Structure Analysis, for symbolic music, offering a domain-specific improvement over existing audio-focused techniques.

The paper tackles section boundary detection in symbolic music by introducing a human-annotated MIDI dataset and a deep learning model with a novel encoding scheme. The model achieves an F1 score of 0.77, outperforming the analogous supervised audio-based approach by 0.22 and the unsupervised CBM audio approach by 0.31.

Current methods for Music Structure Analysis (MSA) focus primarily on audio data. While symbolic music can be synthesized into audio and analyzed using existing MSA techniques, such an approach does not exploit symbolic music's rich explicit representation of pitch, timing, and instrumentation. A key subproblem of MSA is section boundary detection: determining whether a given point in time marks the transition between musical sections. In this paper, we study automatic section boundary detection for symbolic music. First, we introduce a human-annotated MIDI dataset for section boundary detection, consisting of metadata from 6134 MIDI files that we manually curated from the Lakh MIDI dataset. Second, we train a deep learning model to classify the presence of section boundaries within a fixed-length musical window. Our data representation involves a novel encoding scheme based on synthesized overtones to encode arbitrary MIDI instrumentations into 3-channel piano rolls. Our model achieves an F1 score of 0.77, improving over the analogous audio-based supervised learning approach and the unsupervised block-matching segmentation (CBM) audio approach by 0.22 and 0.31, respectively. We release our dataset, code, and models.
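To make the reported metric concrete, the following is a minimal sketch of how an F1 score for barwise boundary detection could be computed. It assumes boundaries are given as bar indices and that a prediction counts as correct only on an exact bar match; the abstract does not state the paper's actual evaluation protocol (for example, any tolerance window), so the function name `boundary_f1` and the example bar lists are purely illustrative.

```python
def boundary_f1(predicted, annotated):
    """Return (precision, recall, F1) for barwise boundary labels.

    Assumption: a predicted boundary is a hit only if it falls on
    exactly the same bar index as an annotated boundary.
    """
    pred, ref = set(predicted), set(annotated)
    hits = len(pred & ref)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: bar indices where a new section starts.
annotated_bars = [8, 16, 24, 32]
predicted_bars = [8, 16, 25, 32, 40]

p, r, f1 = boundary_f1(predicted_bars, annotated_bars)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In this toy case three of the five predictions land on annotated bars, so precision is 3/5 and recall is 3/4. F1 is the harmonic mean of the two, which penalizes a model that over-predicts boundaries as well as one that misses them.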

