ReaLJam: Real-Time Human-AI Music Jamming with Reinforcement Learning-Tuned Transformers
This work addresses the need for real-time, cooperative musical applications for musicians, though its contribution is incremental, since it builds on existing generative AI models.
The authors address real-time human-AI music jamming by introducing ReaLJam, an interface and protocol built on reinforcement learning-tuned Transformers that uses anticipation to support low-latency interaction. A user study with experienced musicians showed that it enables enjoyable and musically interesting sessions.
Recent advances in generative artificial intelligence (AI) have created models capable of high-quality musical content generation. However, little consideration has been given to how these models can be used in real-time or cooperative jamming applications, which require several crucial features: low latency, the ability to communicate planned actions, and the ability to adapt to user input in real time. To support these needs, we introduce ReaLJam, an interface and protocol for live musical jamming sessions between a human and a Transformer-based AI agent trained with reinforcement learning. We enable real-time interactions using the concept of anticipation, where the agent continually predicts how the performance will unfold and visually conveys its plan to the user. We conduct a user study where experienced musicians jam in real time with the agent through ReaLJam. Our results demonstrate that ReaLJam enables enjoyable and musically interesting sessions, and we uncover important takeaways for future work.
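The idea of anticipation described in the abstract can be illustrated with a small sketch. This is not the authors' protocol. It is a minimal toy model that assumes the agent keeps a plan of upcoming beats, freezes the few beats nearest the playhead so they can be displayed and played without waiting on the model, and regenerates the rest whenever new user input arrives. The names `AnticipationBuffer`, `horizon`, `commit`, and `echo_planner` are all hypothetical, and the planner is a trivial stand-in for the Transformer agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: given the user's notes so far and a beat count,
# return that many planned chords (one label per beat).
Planner = Callable[[List[str], int], List[str]]


@dataclass
class AnticipationBuffer:
    planner: Planner   # stand-in for the AI agent (assumption)
    horizon: int = 8   # beats planned ahead of the playhead (assumption)
    commit: int = 2    # beats nearest the playhead that stay frozen (assumption)
    plan: List[str] = field(default_factory=list)

    def update(self, user_history: List[str]) -> List[str]:
        """Keep the committed beats and regenerate the rest of the plan."""
        frozen = self.plan[: self.commit]
        fresh = self.planner(user_history, self.horizon - len(frozen))
        self.plan = frozen + fresh
        return self.plan  # this is what an interface could show to the user

    def tick(self) -> str:
        """Advance the playhead one beat and return what the agent plays.

        Call update() at least once first so the plan is not empty.
        """
        return self.plan.pop(0)


def echo_planner(history: List[str], n: int) -> List[str]:
    """Toy planner: follow the user's latest note, defaulting to C."""
    root = history[-1] if history else "C"
    return [root] * n


buf = AnticipationBuffer(planner=echo_planner, horizon=4, commit=2)
initial_plan = list(buf.update([]))     # nothing heard yet: plan C for 4 beats
first_played = buf.tick()               # the agent plays the first planned beat
revised_plan = list(buf.update(["G"]))  # user plays G: only uncommitted beats change
```

In this sketch the two committed beats remain C after the user plays G, while the later beats switch to G. That separation is one simple way to combine low latency with adaptation: the near future is fixed and can be conveyed to the user, and the far future stays open to revision.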