MetaBGM: Dynamic Soundtrack Transformation For Continuous Multi-Scene Experiences With Ambient Awareness And Personalization
MetaBGM addresses the need for personalized and adaptive audio in interactive media, though the contribution appears incremental: it builds on existing audio generation models and adds a new data transformation method.
The paper tackles the problem of generating background music that adapts to dynamic scenes and user interactions in applications such as games or movies, and reports that MetaBGM produces contextually relevant and dynamic soundtracks.
This paper introduces MetaBGM, a groundbreaking framework for generating background music that adapts to dynamic scenes and real-time user interactions. We define multi-scene as variation in environmental context, such as transitions between game settings or movie scenes. To bridge the gap between backend data and the music description texts that audio generation models expect, MetaBGM employs a novel two-stage generation approach: it transforms continuous scene and user state data into such texts, which are then fed into an audio generation model for real-time soundtrack creation. Experimental results demonstrate that MetaBGM effectively generates contextually relevant and dynamic background music for interactive applications.
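The data flow described above (backend state, then text, then an audio model) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say what the two stages are, so the split into a narrative step and a music description step, the `SceneState` fields, and the function names are all hypothetical, and simple string templates stand in for whatever models the framework actually uses. The audio generation model itself is not called; the sketch stops at the text prompts that would be passed to it.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class SceneState:
    """Hypothetical snapshot of backend data; field names are illustrative."""
    location: str         # e.g. "forest", "cave"
    time_of_day: str      # e.g. "day", "night"
    player_health: float  # 0.0 (empty) to 1.0 (full)
    in_combat: bool


def describe_scene(state: SceneState) -> str:
    """Stage 1 (assumed): turn raw scene and user state into a narrative sentence."""
    action = "fighting" if state.in_combat else "exploring"
    condition = "badly hurt" if state.player_health < 0.3 else "in good shape"
    return (f"The player is {action} in a {state.location} at "
            f"{state.time_of_day} and is {condition}.")


def describe_music(narrative: str) -> str:
    """Stage 2 (assumed): map the narrative to a music description text."""
    tempo = "fast, driving" if "fighting" in narrative else "slow, calm"
    mood = "tense and dark" if "badly hurt" in narrative else "warm and open"
    return f"{tempo} background music with a {mood} mood"


def soundtrack_prompts(states: Iterable[SceneState]) -> Iterator[str]:
    """Yield a new music description only when it differs from the previous one."""
    last = None
    for state in states:
        prompt = describe_music(describe_scene(state))
        if prompt != last:
            yield prompt
            last = prompt
```

In this sketch a stream of states is reduced to the points where the description changes, so a hypothetical downstream audio model would be asked for new music only on a scene or user state transition. Whether MetaBGM regenerates on change or on a fixed schedule is not stated in the abstract.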