SD AI LG MM AS · Jul 14, 2022

Multitrack Music Transformer

Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, Taylor Berg-Kirkpatrick

arXiv:2207.06983v4 · 23.058 citations · h-index: 72 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses efficient multitrack music generation for real-time creative applications. The advance is incremental: it builds on existing transformer-based methods and contributes a new representation.

The authors tackle multitrack music generation with transformer models, where earlier approaches were limited in instrument count and segment length and suffered from slow inference, partly because of the memory requirements of long input sequences. They propose a new representation and model that achieve performance comparable to state-of-the-art systems with substantial speedups and memory reductions, making the method suitable for real-time applications.

Existing approaches for generating multitrack music with transformer models have been limited in the number of instruments and the length of the music segments, and have suffered from slow inference. This is partly due to the memory requirements of the lengthy input sequences necessitated by existing representations. In this work, we propose a new multitrack music representation that allows a diverse set of instruments while keeping a short sequence length. Our proposed Multitrack Music Transformer (MMT) achieves comparable performance with state-of-the-art systems, landing in between two recently proposed models in a subjective listening test, while achieving substantial speedups and memory reductions over both, making the method attractive for real-time improvisation or near-real-time creative applications. Further, we propose a new measure for analyzing musical self-attention and show that the trained model attends more to notes that form a consonant interval with the current note and to notes that are 4N beats away from the current step.
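The two attention findings in the abstract can be illustrated with a small sketch. This is not the paper's measure, which the abstract does not define. It only shows one hypothetical way to ask how much of a note's attention lands on consonant notes or on notes a multiple of four beats away. The note format `(beat, pitch)`, the set of intervals treated as consonant, the requirement that N be at least 1, and the helper names are all assumptions made for illustration.

```python
# Assumption: pitch-class intervals (in semitones) treated as consonant here:
# unison/octave, minor/major third, perfect fourth/fifth, minor/major sixth.
CONSONANT_INTERVALS = {0, 3, 4, 5, 7, 8, 9}


def is_consonant(current, other):
    """True if two (beat, pitch) notes form a consonant interval, ignoring octaves."""
    return abs(current[1] - other[1]) % 12 in CONSONANT_INTERVALS


def is_4n_beats_away(current, other):
    """True if the notes are 4N beats apart for some integer N >= 1 (assumed)."""
    distance = abs(current[0] - other[0])
    return distance > 0 and distance % 4 == 0


def attention_share(notes, weights, current, predicate):
    """Fraction of the attention mass from notes[current] that falls on
    earlier notes satisfying predicate(notes[current], earlier_note).

    weights[i] is the (hypothetical) attention weight on notes[i], i < current.
    """
    total = 0.0
    matched = 0.0
    for idx in range(current):
        total += weights[idx]
        if predicate(notes[current], notes[idx]):
            matched += weights[idx]
    return matched / total if total else 0.0


# Toy data: (beat, MIDI pitch) pairs and made-up attention weights
# from the last note to the four earlier notes.
notes = [(0, 60), (1, 62), (2, 64), (4, 67), (8, 72)]
weights = [0.5, 0.125, 0.125, 0.25]

consonant_share = attention_share(notes, weights, 4, is_consonant)
beat_share = attention_share(notes, weights, 4, is_4n_beats_away)
print(consonant_share, beat_share)
```

In a real analysis, the weights would come from a trained model's attention maps, and the shares would be averaged over many notes and compared against a baseline.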
