SD | AI | AS | Feb 20, 2024

Structure-informed Positional Encoding for Music Generation

arXiv:2402.13301v2 | 6 citations | h-index: 9 | ICASSP
AI Analysis

This paper addresses incoherent music generation in AI music applications, but the contribution is incremental: it builds on existing Transformer methods and adds new positional encodings.

The paper tackles the lack of coherence and long-term organization in music generated by deep learning. It proposes a structure-informed positional encoding framework for Transformers, which improves melodic and structural consistency in symbolic music generation tasks.

Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. We design three variants in terms of absolute, relative and non-stationary positional information. We comprehensively test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation. As a comparison, we choose multiple baselines from the literature and demonstrate the merits of our methods using several musically-motivated evaluation metrics. In particular, our methods improve the melodic and structural consistency of the generated pieces.
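
To make the idea of an absolute, structure-informed encoding concrete, here is a minimal sketch. It is not the paper's formulation, which this summary does not give. It assumes that each timestep carries a hypothetical integer structural label (here, a bar index) and that this label replaces the raw token index in the standard sinusoidal encoding. The function names and the 4-timesteps-per-bar layout are illustrative assumptions.

```python
import math

def sinusoidal_encoding(index, d_model):
    """Standard sinusoidal encoding of a single integer index."""
    vec = []
    for i in range(0, d_model, 2):
        angle = index / (10000 ** (i / d_model))
        vec.append(math.sin(angle))
        vec.append(math.cos(angle))
    return vec[:d_model]

def structure_informed_encoding(structure_labels, d_model):
    """Encode each timestep by its structural label, not its token index.

    structure_labels: hypothetical per-timestep integers
    (for example a bar or section index); an assumption of this sketch.
    """
    return [sinusoidal_encoding(label, d_model) for label in structure_labels]

# Hypothetical example: 8 timesteps, 4 per bar, so labels are bar indices.
bar_labels = [t // 4 for t in range(8)]
encodings = structure_informed_encoding(bar_labels, d_model=8)
```

In this sketch, timesteps in the same bar share an encoding, so the model sees position at the level of musical structure rather than raw sequence order. A practical system would likely combine several structural levels or add a within-bar offset; that choice is left open here.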

