Multiverse: Language-Conditioned Multi-Game Level Blending via Shared Representation
This work addresses the challenge of intuitive, language-controlled procedural content generation for multiple game domains, offering a novel approach to cross-game level blending.
The paper addresses text-to-level generation across multiple games by learning a shared latent space that aligns text and level structures, which enables cross-game level blending and zero-shot generation from compositional prompts. In the reported experiments, the model significantly improves blending quality within the same game genre and provides a unified representation for multi-game content generation.
Text-to-level generation aims to translate natural language descriptions into structured game levels, enabling intuitive control over procedural content generation. Prior text-to-level generators are typically limited to a single game domain; extending language-conditioned generation to multiple games requires learning representations that capture structural relationships across domains. We propose Multiverse, a language-conditioned multi-game level generator that enables cross-game level blending through textual specifications. The model learns a shared latent space that aligns textual instructions with level structures, while threshold-based multi-positive contrastive supervision links semantically related levels across games. This representation allows language to guide which structural characteristics are preserved when combining content from different games, enabling controllable blending through latent interpolation and zero-shot generation from compositional textual prompts. Experiments show that the learned representation supports controllable cross-game level blending and significantly improves blending quality within the same game genre, while providing a unified representation for language-conditioned multi-game content generation.
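The abstract does not give the exact form of the contrastive objective or the interpolation scheme, so the following is only a minimal sketch under stated assumptions. It assumes that positives are chosen by thresholding the cosine similarity of hypothetical per-level descriptor vectors, that the loss takes a supervised-contrastive (SupCon-style) form averaged over all positives of each anchor, and that blending is plain linear interpolation between two latent codes. The names `positive_mask`, `multi_positive_contrastive_loss`, `interpolate`, `threshold`, and `temperature` are illustrative and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def positive_mask(descriptors, threshold):
    """mask[i][j] is True when levels i and j are treated as a positive pair.

    Assumed rule (not from the paper): descriptor cosine similarity >= threshold.
    A level is never its own positive.
    """
    n = len(descriptors)
    return [
        [i != j and cosine(descriptors[i], descriptors[j]) >= threshold
         for j in range(n)]
        for i in range(n)
    ]


def multi_positive_contrastive_loss(latents, mask, temperature=0.1):
    """SupCon-style loss: each anchor averages -log softmax over all its positives.

    Anchors without any positive are skipped.
    """
    n = len(latents)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if mask[i][j]]
        if not positives:
            continue
        # Similarity logits against every other sample in the batch.
        logits = {j: cosine(latents[i], latents[j]) / temperature
                  for j in range(n) if j != i}
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def interpolate(z_a, z_b, alpha):
    """Linear blend of two latent codes; alpha=0 returns z_a, alpha=1 returns z_b."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(z_a, z_b)]


# Toy batch: levels 0 and 1 have similar descriptors, level 2 differs.
descriptors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
mask = positive_mask(descriptors, threshold=0.8)

aligned = [[1.0, 0.0], [1.0, 0.05], [0.0, 1.0]]     # positives sit close together
misaligned = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.05]]  # positives sit far apart

loss_aligned = multi_positive_contrastive_loss(aligned, mask)
loss_misaligned = multi_positive_contrastive_loss(misaligned, mask)
blend = interpolate(aligned[0], aligned[2], alpha=0.5)
```

In this toy batch the loss is lower when latents of positive pairs lie close together, which is the behaviour such supervision is meant to encourage. In a full system the blended latent would be passed to a level decoder, which is outside the scope of this sketch.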