SIMSplat: Predictive Driving Scene Editing with Language-aligned 4D Gaussian Splatting
This work addresses the need of autonomous driving researchers and developers for more flexible and realistic driving scene editing tools. The contribution appears incremental, since it builds on existing Gaussian splatting and language alignment techniques.
The paper tackles realistic and efficient driving scene manipulation by introducing SIMSplat, a language-aligned 4D Gaussian splatting editor controlled through natural language prompts. It supports detailed object-level modifications, including adding new objects and adjusting the trajectories of vehicles and pedestrians, and refines edited paths with multi-agent motion prediction. Experiments on the Waymo dataset demonstrate its editing capabilities across a wide range of scenarios.
Driving scene manipulation with sensor data is emerging as a promising alternative to traditional virtual driving simulators. However, existing frameworks struggle to generate realistic scenarios efficiently due to limited editing capabilities. To address these challenges, we present SIMSplat, a predictive driving scene editor with language-aligned Gaussian splatting. As a language-controlled editor, SIMSplat enables intuitive manipulation using natural language prompts. By aligning language with Gaussian-reconstructed scenes, it further supports direct querying of road objects, allowing precise and flexible editing. Our method provides detailed object-level editing, including adding new objects and modifying the trajectories of both vehicles and pedestrians, while also incorporating predictive path refinement through multi-agent motion prediction to generate realistic interactions among all agents in the scene. Experiments on the Waymo dataset demonstrate SIMSplat's extensive editing capabilities and adaptability across a wide range of scenarios. Project page: https://sungyeonparkk.github.io/simsplat/
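The abstract does not spell out how language queries are matched to road objects. As a rough illustration only, the sketch below assumes each reconstructed object carries an embedding in the same space as the text prompt, and selects objects by cosine similarity. The function names, the toy three-dimensional embeddings, and the 0.8 threshold are all hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_objects(text_embedding, object_embeddings, threshold=0.8):
    """Return IDs of objects whose embedding matches the prompt, best match first.

    object_embeddings maps a hypothetical object ID to its embedding vector.
    """
    scores = {
        object_id: cosine(text_embedding, embedding)
        for object_id, embedding in object_embeddings.items()
    }
    hits = [object_id for object_id, score in scores.items() if score >= threshold]
    return sorted(hits, key=lambda object_id: scores[object_id], reverse=True)


# Toy embeddings, invented for illustration (assumption: not real model outputs).
objects = {
    "vehicle_0": [0.9, 0.1, 0.0],
    "vehicle_1": [0.8, 0.3, 0.1],
    "pedestrian_0": [0.0, 0.2, 0.9],
}
prompt_embedding = [1.0, 0.1, 0.0]  # stands in for an encoded prompt such as "the car"

matches = query_objects(prompt_embedding, objects)
print(matches)
```

In this toy setup the two vehicle entries pass the threshold and the pedestrian does not, so an edit such as a trajectory change would be applied only to the selected objects. A real system would obtain the embeddings from a vision-language model rather than from hand-written vectors.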