Event-Driven Storytelling with Multiple Lifelike Humans in a 3D Scene
This work addresses the challenge of creating lively virtual scenes with realistic human interactions, which matters for applications such as gaming and simulation. The contribution appears incremental, however, because it builds on existing LLM and motion synthesis methods.
The paper tackles the problem of generating contextual motions for multiple humans in a 3D scene by proposing a framework that uses a large language model to break down the task into subproblems, enabling scalable and diverse multi-agent behavior. Results from benchmark tests and user studies show the framework effectively captures scene context with high scalability.
In this work, we propose a framework that creates a lively, dynamic virtual scene with contextual motions of multiple humans. Generating multi-human contextual motion requires holistic reasoning over the dynamic relationships in human-human and human-scene interactions. We harness a large language model (LLM) to digest the contextual complexity of the textual input and convert the task into tangible subproblems, so that we can generate multi-agent behavior at a scale not considered before. Specifically, our event generator formulates the temporal progression of a dynamic scene as a sequence of small events. Each event calls for a well-defined motion involving the relevant characters and objects. Next, we synthesize the motions of characters at positions sampled according to spatial guidance. We employ a high-level module to deliver scalable yet comprehensive context, translating events into relative descriptions that enable the retrieval of precise coordinates. As the first to address this problem at scale and with diversity, we offer a benchmark to assess diverse aspects of contextual reasoning. Benchmark results and user studies show that our framework effectively captures scene context with high scalability. The code and benchmark, along with result videos, are available at our project page: https://rms0329.github.io/Event-Driven-Storytelling/.
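The step from events to coordinates can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Event`, `SCENE`, `RELATION_RANGE`, `sample_position`, and `plan` are hypothetical, the events are written by hand rather than produced by an LLM, the distance ranges are invented for illustration, and motion synthesis is left out. It only shows how a relative description such as "near the sofa" might be resolved into a sampled floor position for each character involved in an event.

```python
import math
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Event:
    """One small event: who does what, anchored to which object (hypothetical schema)."""
    characters: List[str]
    action: str
    target: str    # name of the scene object the event refers to
    relation: str  # relative description, e.g. "near"


# Hypothetical scene layout: object name -> (x, y) floor position in meters.
SCENE: Dict[str, Tuple[float, float]] = {"sofa": (2.0, 1.0), "table": (-1.0, 3.0)}

# Assumed distance range (meters) for each relative description.
RELATION_RANGE: Dict[str, Tuple[float, float]] = {
    "near": (0.5, 1.0),
    "far_from": (3.0, 4.0),
}


def sample_position(event: Event, scene, rng: random.Random) -> Tuple[float, float]:
    """Sample a point in the ring around the target that matches the relation."""
    cx, cy = scene[event.target]
    lo, hi = RELATION_RANGE[event.relation]
    radius = rng.uniform(lo, hi)
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (cx + radius * math.cos(angle), cy + radius * math.sin(angle))


def plan(events: List[Event], scene, seed: int = 0):
    """Turn an event sequence into (character, action, position) assignments."""
    rng = random.Random(seed)
    assignments = []
    for event in events:
        for name in event.characters:
            assignments.append((name, event.action, sample_position(event, scene, rng)))
    return assignments


if __name__ == "__main__":
    story = [
        Event(["alice", "bob"], "sit and chat", "sofa", "near"),
        Event(["carol"], "stand and watch", "table", "far_from"),
    ]
    for name, action, (x, y) in plan(story, SCENE):
        print(f"{name}: {action} at ({x:.2f}, {y:.2f})")
```

In this sketch each position would then be handed to a motion synthesizer. Keeping the events symbolic until the sampling step means the high-level description never has to carry raw coordinates, which is one plausible reading of why relative descriptions scale to larger scenes.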