SD · AI · HC · LG · AS · Jan 25, 2025

Music Generation using Human-In-The-Loop Reinforcement Learning

arXiv:2501.15304v1 · 3 citations · h-index: 1 · BigData
Originality: Synthesis-oriented
AI Analysis

The paper addresses personalized music generation by letting users shape compositions through their feedback, but the contribution is incremental: it applies an existing HITL RL method to a new domain.

This paper tackles the problem of real-time music generation by combining Human-In-The-Loop Reinforcement Learning with music theory principles, resulting in a system that iteratively improves musical compositions based on user feedback.

This paper presents an approach that combines Human-In-The-Loop Reinforcement Learning (HITL RL) with principles derived from music theory to facilitate real-time generation of musical compositions. HITL RL, previously employed in diverse applications such as modelling humanoid robot mechanics and enhancing language models, harnesses human feedback to refine the training process. In this study, we develop a HITL RL framework that can leverage the constraints and principles of music theory. In particular, we propose an episodic tabular Q-learning algorithm with an epsilon-greedy exploration policy. The system generates musical tracks (compositions) and continuously enhances their quality through iterative human-in-the-loop feedback. The reward function for this process is the subjective musical taste of the user.
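The abstract names episodic tabular Q-learning with epsilon-greedy exploration but gives no implementation details, so the following is only a minimal sketch under stated assumptions, not the authors' method. The state is assumed to be the previous note, the action the next note, the music-theory constraint is reduced to staying within one octave of C major, and the user's rating is assumed to arrive once at the end of each track. The names `C_MAJOR`, `choose_action`, `generate_track`, `train`, and `stepwise_listener` are all hypothetical.

```python
import random

# Assumption: one octave of C major (MIDI note numbers) stands in for the
# paper's music-theory constraints, which the abstract does not specify.
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]
N = len(C_MAJOR)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else take an argmax."""
    if rng.random() < epsilon:
        return rng.randrange(N)
    row = q[state]
    best = max(row)
    # Break ties at random so early episodes are not stuck on index 0.
    return rng.choice([a for a, v in enumerate(row) if v == best])


def generate_track(q, length, epsilon, rng):
    """Return scale indices; each chosen note becomes the next state."""
    state = 0  # hypothetical choice: every track starts on the tonic
    track = [state]
    for _ in range(length - 1):
        state = choose_action(q, state, epsilon, rng)
        track.append(state)
    return track


def train(rate_track, episodes=200, length=8,
          alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run episodes; rate_track(notes) plays the role of the human listener."""
    rng = random.Random(seed)
    q = [[0.0] * N for _ in range(N)]  # Q[previous note][next note]
    for _ in range(episodes):
        track = generate_track(q, length, epsilon, rng)
        reward = rate_track([C_MAJOR[i] for i in track])
        last = len(track) - 2  # index of the final transition
        for t in range(len(track) - 1):
            s, a = track[t], track[t + 1]
            # Assumption: the rating is given only on the final transition;
            # earlier transitions bootstrap from the next state's best value.
            target = reward if t == last else gamma * max(q[a])
            q[s][a] += alpha * (target - q[s][a])
    return q


def stepwise_listener(notes):
    """Simulated taste: fraction of intervals of at most two semitones."""
    pairs = list(zip(notes, notes[1:]))
    return sum(abs(b - a) <= 2 for a, b in pairs) / len(pairs)
```

In an interactive setting, `rate_track` would play the track and ask the user for a score; here `train(stepwise_listener)` substitutes a scripted preference so the loop can run unattended. After training, `generate_track(q, 8, 0.0, random.Random(0))` samples a greedy track from the learned table.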
