Unsupervised Lead Sheet Generation via Semantic Compression
This work addresses the need for high-quality lead sheets in generative music research, where they serve downstream tasks such as multitrack music generation and automatic arrangement. The approach is novel, though its improvement over existing methods is incremental.
The paper frames lead sheet generation from full scores as an unsupervised music compression task and introduces Lead-AE, a model that improves upon the deterministic baseline in both automatic and human evaluations and produces coherent reductions of multitrack scores.
Lead sheets have become commonplace in generative music research, where they serve as an initial compressed representation for downstream tasks like multitrack music generation and automatic arrangement. Despite this, researchers seeking paired lead sheets and full scores have often fallen back on deterministic reduction methods (such as the skyline algorithm) to generate the lead sheets, paying little attention to the quality of the lead sheets themselves or to how accurately they reflect their orchestrated counterparts. To address these issues, we propose the problem of conditional lead sheet generation (i.e., generating a lead sheet given its full score version), and show that this task can be formulated as an unsupervised music compression task, where the lead sheet represents a compressed latent version of the score. We introduce a novel model, called Lead-AE, that models the lead sheet as a discrete subselection of the original sequence, using a differentiable top-k operator to allow for controllable local sparsity constraints. Across both automatic proxy tasks and direct human evaluations, we find that our method improves upon the established deterministic baseline and produces coherent reductions of large multitrack scores.
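
For readers unfamiliar with the deterministic baseline mentioned above, the following is a minimal sketch of one common variant of the skyline algorithm: at each onset time it keeps only the highest-pitched note, then shortens each kept note so that it does not overlap the next one. The `Note` tuple and the `skyline` function name are assumptions made for illustration; the source does not specify which variant or note representation was used.

```python
from typing import Dict, List, NamedTuple


class Note(NamedTuple):
    """Hypothetical note representation, assumed for this sketch."""
    onset: float     # start time in beats
    duration: float  # length in beats
    pitch: int       # MIDI pitch number


def skyline(notes: List[Note]) -> List[Note]:
    """Reduce a polyphonic note list to a monophonic line (simplified skyline)."""
    # Step 1: for every distinct onset, keep only the highest-pitched note.
    highest: Dict[float, Note] = {}
    for note in notes:
        current = highest.get(note.onset)
        if current is None or note.pitch > current.pitch:
            highest[note.onset] = note
    kept = [highest[onset] for onset in sorted(highest)]

    # Step 2: clip each kept note so it ends no later than the next kept onset.
    melody: List[Note] = []
    for i, note in enumerate(kept):
        if i + 1 < len(kept):
            gap = kept[i + 1].onset - note.onset
            note = note._replace(duration=min(note.duration, gap))
        melody.append(note)
    return melody


if __name__ == "__main__":
    score = [
        Note(0, 4, 48),  # sustained bass note
        Note(0, 1, 72),  # melody
        Note(1, 1, 74),  # melody
        Note(1, 1, 60),  # inner voice
        Note(2, 2, 55),  # accompaniment sounding alone
    ]
    for note in skyline(score):
        print(note)
```

In this toy input the accompaniment note at beat 2 is selected simply because nothing else starts at that onset. This illustrates why a purely rule-based reduction may not reflect the full score well.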