TinyMusician: On-Device Music Generation with Knowledge Distillation and Mixed Precision Quantization
TinyMusician enables on-device music generation on smartphones and wearables without cloud dependency, although the contribution is incremental because it builds on existing distillation and quantization techniques.
The authors tackle the problem of deploying large Transformer-based music generation models on edge devices by introducing TinyMusician, a lightweight model that retains 93% of MusicGen-Small's performance with a 55% smaller model size.
The success of generative models has drawn unprecedented attention to music generation, and Transformer-based architectures have set new benchmarks for model performance. However, their practical adoption is hindered by two critical challenges that stem from their large parameter counts: heavy demand for computational resources and long inference times. These obstacles make them infeasible to deploy on edge devices with limited computational resources, such as smartphones and wearables. In this work, we present TinyMusician, a lightweight music generation model distilled from MusicGen, a state-of-the-art music generation model. TinyMusician integrates two innovations: (i) Stage-mixed Bidirectional and Skewed KL-Divergence and (ii) Adaptive Mixed-Precision Quantization. Experimental results demonstrate that TinyMusician retains 93% of MusicGen-Small's performance with a 55% smaller model size. TinyMusician is the first mobile-deployable music generation model that eliminates cloud dependency while maintaining high audio fidelity and efficient resource usage.
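To make the first innovation more concrete, the following is a minimal sketch of a bidirectional, skewed KL distillation loss on a single token distribution. The abstract does not define the loss, so everything here is an assumption: the skewed form KL(p || alpha*p + (1-alpha)*q) is one common definition from the distillation literature, and the names `skew_kl`, `bidirectional_skew_kl`, `alpha`, and `forward_weight` are hypothetical. In particular, `forward_weight` stands in for whatever stage-dependent mixing the paper actually uses.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert logits to probabilities with a numerically stable softmax."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps))
               for pi, qi in zip(p, q) if pi > 0)

def skew_kl(p, q, alpha=0.1):
    """Skewed KL (assumed form): KL(p || alpha*p + (1-alpha)*q).

    Mixing a little of p into the second argument keeps the ratio
    bounded when q assigns near-zero mass to a token.
    """
    mix = [alpha * pi + (1 - alpha) * qi for pi, qi in zip(p, q)]
    return kl(p, mix)

def bidirectional_skew_kl(teacher_logits, student_logits,
                          alpha=0.1, forward_weight=0.5, temperature=1.0):
    """Weighted sum of forward and reverse skewed KL terms.

    forward_weight is a hypothetical knob; a stage-mixed schedule
    could change it over the course of training.
    """
    p = softmax(teacher_logits, temperature)  # teacher distribution
    q = softmax(student_logits, temperature)  # student distribution
    forward = skew_kl(p, q, alpha)   # teacher-to-student direction
    reverse = skew_kl(q, p, alpha)   # student-to-teacher direction
    return forward_weight * forward + (1 - forward_weight) * reverse

same = bidirectional_skew_kl([2.0, 0.0, -1.0], [2.0, 0.0, -1.0])
diff = bidirectional_skew_kl([2.0, 0.0, -1.0], [0.0, 1.0, 0.5])
```

In this sketch the loss is approximately zero when student and teacher logits match and grows as they diverge. Setting `forward_weight` to 1.0 or 0.0 recovers a purely forward or purely reverse skewed term.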