Not All Points Are Equal: Uncertainty-Aware 4D LiDAR Scene Synthesis
For embodied AI and autonomous driving, this method improves the realism of generated LiDAR sequences by focusing modeling capacity on perceptually difficult regions.
U4D introduces an uncertainty-aware framework for 4D LiDAR scene synthesis that prioritizes high-uncertainty regions (e.g., distant surfaces, occluded boundaries) using a two-stage diffusion process, achieving state-of-the-art fidelity and temporal consistency on nuScenes and SemanticKITTI.
Constructing faithful 4D worlds from LiDAR-acquired sequences is crucial for embodied AI, yet current generative frameworks apply uniform modeling capacity across all spatial regions. This ignores the fact that perceptual difficulty varies dramatically within a single scan: distant surfaces, occluded boundaries, and small-scale objects carry far higher uncertainty than well-observed structures. We present U4D, a new framework that explicitly leverages spatial uncertainty to guide LiDAR scene generation in a "hard-to-easy" schedule. U4D derives per-point uncertainty maps by computing the Shannon entropy of a pretrained segmentor's predictions. It then applies an unconditional diffusion stage to synthesize the high-entropy regions with precise geometry, followed by a conditional completion stage that fills in the remaining regions using these structures as priors. A MoST (Mixture of Spatio-Temporal) block further maintains cross-frame coherence by dynamically balancing spatial detail and temporal continuity. Extensive experiments on nuScenes and SemanticKITTI demonstrate state-of-the-art scene fidelity, temporal consistency, and downstream performance.
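
The uncertainty step described above can be illustrated with a minimal sketch. It assumes the segmentor outputs per-point class logits of shape (N, C) and that the "hard" region is taken to be a fixed top fraction of points ranked by entropy. The function names, the `keep_ratio` parameter, and the top-fraction rule are hypothetical choices made for illustration; the abstract does not specify how U4D thresholds its uncertainty maps.

```python
import numpy as np

def shannon_entropy(logits: np.ndarray) -> np.ndarray:
    """Per-point Shannon entropy (in nats) of softmax class probabilities.

    `logits` has shape (N, C): N points, C semantic classes (assumed layout).
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # stabilise the softmax
    probs = np.exp(shifted)
    probs /= probs.sum(axis=1, keepdims=True)
    # Clip inside the log so that a zero probability contributes 0, not NaN.
    return -(probs * np.log(np.clip(probs, 1e-12, None))).sum(axis=1)

def split_by_uncertainty(entropy: np.ndarray, keep_ratio: float = 0.2) -> np.ndarray:
    """Boolean mask marking the highest-entropy points.

    `keep_ratio` is a hypothetical hyperparameter: the fraction of points
    treated as "hard". Ties at the threshold may select slightly more points.
    """
    k = max(1, int(round(keep_ratio * entropy.shape[0])))
    threshold = np.partition(entropy, -k)[-k]  # k-th largest entropy value
    return entropy >= threshold

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(1000, 16))  # stand-in for real segmentor output
    entropy = shannon_entropy(logits)
    hard_mask = split_by_uncertainty(entropy, keep_ratio=0.2)
    print(f"{hard_mask.sum()} of {hard_mask.size} points marked high-uncertainty")
```

In a two-stage pipeline of the kind the abstract describes, points where the mask is true would be the targets of the first (unconditional) stage, and the remaining points would be completed in the second stage, conditioned on the first-stage output.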