SD, AI, CL, IR, MM, AS | Sep 20, 2022

Setting the rhythm scene: deep learning-based drum loop generation from arbitrary language cues

arXiv:2209.10016v1 | 1 citation | h-index: 8
Originality: Incremental advance
AI Analysis

The work provides a tool for electronic music and audiovisual production that could aid both professionals and hobbyists, but it is an incremental advance: it combines existing generative AI techniques with a new data extraction method.

The paper tackles generating 2-bar drum loops from arbitrary English language cues for music composition and performance, presenting a method that includes a novel consensus drum track extraction technique to create training data.

Generative artificial intelligence models can be a valuable aid to music composition and live performance, both to support the professional musician and to help democratize the music creation process for hobbyists. Here we present a novel method that, given an English word or phrase, generates 2 bars of a 4-piece drum pattern that embodies the "mood" of the given language cue, or that could be used for an audiovisual scene described by the language cue. We envision this tool as a composition aid for electronic music and audiovisual soundtrack production, or as an improvisation tool for live performance. To produce the training samples for this model, besides manually annotating the "scene" or "mood" terms, we have designed a novel method to extract the consensus drum track of any song. This consists of a 2-bar, 4-piece drum pattern that represents the main percussive motif of a song, which could be imported into any music loop device or live looping software. These two key components (drum pattern generation from a generalizable input, and consensus percussion extraction) present a novel approach to computer-aided composition and provide a stepping stone for more comprehensive rhythm generation.
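To make the idea of a 2-bar, 4-piece consensus pattern concrete, the following is a minimal sketch and not the paper's method. The abstract does not say how the consensus is computed, so several things here are assumptions made only for illustration: a 16th-note grid in 4/4, the four piece names, the function name `consensus_pattern`, and a simple majority vote across consecutive 2-bar windows of a song.

```python
# Illustrative sketch only: the grid resolution, piece names, and the
# majority-vote rule are assumptions, not details taken from the paper.

STEPS_PER_BAR = 16                  # assumed 16th-note grid in 4/4
PATTERN_STEPS = 2 * STEPS_PER_BAR   # a 2-bar pattern has 32 steps
PIECES = ("kick", "snare", "closed_hat", "open_hat")  # hypothetical 4-piece kit


def consensus_pattern(track):
    """Reduce a whole-song drum track to one 2-bar pattern.

    `track` maps each piece name to a list of 0/1 onsets whose length is a
    non-zero multiple of PATTERN_STEPS. A step is kept in the result when it
    is hit in strictly more than half of the song's 2-bar windows.
    """
    result = {}
    for piece in PIECES:
        hits = track[piece]
        if len(hits) == 0 or len(hits) % PATTERN_STEPS != 0:
            raise ValueError(f"{piece}: length must be a multiple of {PATTERN_STEPS}")
        n_windows = len(hits) // PATTERN_STEPS
        # Count, for every step position, how many windows contain a hit.
        votes = [0] * PATTERN_STEPS
        for i, hit in enumerate(hits):
            votes[i % PATTERN_STEPS] += hit
        result[piece] = [1 if 2 * v > n_windows else 0 for v in votes]
    return result


if __name__ == "__main__":
    # Three 2-bar windows: a kick on every half bar, plus one stray hit
    # in the middle window that the vote should discard.
    steady = [1 if s % 8 == 0 else 0 for s in range(PATTERN_STEPS)]
    fill = list(steady)
    fill[3] = 1
    song = {p: [0] * (3 * PATTERN_STEPS) for p in PIECES}
    song["kick"] = steady + fill + steady
    print(consensus_pattern(song)["kick"])
```

A grid of this shape is the kind of output that could be loaded into a loop device or live looping software. A real extraction pipeline would first need drum transcription and beat alignment, which this sketch leaves out.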

Code Implementations: 1 repo