SD · LG · AS · Nov 14, 2024

Local deployment of large-scale music AI models on commodity hardware

arXiv:2411.09625v1 · 3 citations · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of making advanced music AI models accessible to music software developers on standard hardware. The contribution is incremental, since it builds on existing models and frameworks.

The researchers tackled the challenge of deploying large-scale music AI models on commodity hardware by creating MIDInfinite, a web application that generates symbolic music locally. On an M3 MacBook Pro it generates 51 notes per second, and 72.9% of generations are faster than real-time playback.

We present MIDInfinite, a web application capable of generating symbolic music using a large-scale generative AI model locally on commodity hardware. Creating this demo involved porting the Anticipatory Music Transformer, a large language model (LLM) pre-trained on the Lakh MIDI dataset, to the Machine Learning Compilation (MLC) framework. Once the model is ported, MLC facilitates inference on a variety of runtimes including C++, mobile, and the browser. We envision that MLC has the potential to bridge the gap between the landscape of increasingly capable music AI models and technology more familiar to music software developers. As a proof of concept, we build a web application that allows users to generate endless streams of multi-instrumental MIDI in the browser, either from scratch or conditioned on a prompt. On commodity hardware (an M3 MacBook Pro), our demo can generate 51 notes per second, which is faster than real-time playback for 72.9% of generations, and increases to 86.3% with 2 seconds of upfront buffering.
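One way to read the "faster than real-time playback" figures is as a check that every note is generated before the player needs it. The sketch below is an illustration under stated assumptions, not the authors' measurement procedure: it assumes a constant generation rate, note onsets given in seconds, and playback that starts after a fixed buffering delay. The function name `keeps_up` and its parameters are hypothetical.

```python
from typing import Sequence


def keeps_up(
    onsets_s: Sequence[float],
    notes_per_second: float,
    buffer_s: float = 0.0,
) -> bool:
    """Return True if every note is generated before playback reaches it.

    Assumptions (not taken from the paper): notes are generated in onset
    order at a constant rate, and playback starts `buffer_s` seconds after
    generation begins.
    """
    if notes_per_second <= 0:
        raise ValueError("notes_per_second must be positive")
    for i, onset in enumerate(onsets_s):
        ready_at = (i + 1) / notes_per_second  # time at which note i has been generated
        needed_at = buffer_s + onset           # time at which playback reaches note i
        if ready_at > needed_at:
            return False
    return True


# Hypothetical streams: 10 notes/s starting at 0.5 s, and 100 notes/s from 0 s.
sparse = [0.5 + 0.1 * i for i in range(50)]
dense = [0.01 * i for i in range(200)]

sparse_ok = keeps_up(sparse, notes_per_second=51)
dense_ok = keeps_up(dense, notes_per_second=51)
dense_buffered_ok = keeps_up(dense, notes_per_second=51, buffer_s=2.0)
```

Under these assumptions the sparse stream plays without stalls, while the dense stream falls behind immediately unless playback is delayed. This is the kind of effect that upfront buffering is meant to absorb: a short burst of dense notes can be covered by a head start, but a stream that stays denser than the generation rate will eventually stall regardless of the buffer.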

