CV | Apr 10

SynFlow: Scaling Up LiDAR Scene Flow Estimation with Synthetic Data

Qingwen Zhang, Xiaomeng Zhu, Chenhan Jiang, Patric Jensfelt

arXiv:2604.09411 | 78.5 | Has Code

AI Analysis

SynFlow addresses the challenge of reliable 3D motion estimation for autonomous systems by replacing scarce real-world motion annotations with scalable synthetic data, which the authors frame as a shift in paradigm rather than an incremental improvement.

The paper tackles 3D dynamic perception, where progress is hindered by scarce motion annotations. It proposes SynFlow, a pipeline that generates a large-scale synthetic dataset for LiDAR scene flow with a 34x scale-up in annotated volume over existing real-world benchmarks. Models trained only on this data generalize zero-shot to real-world benchmarks and outperform state-of-the-art methods on TruckScenes by 31.8%.

Reliable 3D dynamic perception requires models that can anticipate motion beyond predefined categories, yet progress is hindered by the scarcity of dense, high-quality motion annotations. While self-supervision on unlabeled real data offers a path forward, empirical evidence suggests that scaling unlabeled data fails to close the performance gap due to noisy proxy signals. In this paper, we propose a shift in paradigm: learning robust real-world motion priors entirely from scalable simulation. We introduce SynFlow, a data generation pipeline that produces a large-scale synthetic dataset specifically designed for LiDAR scene flow. Unlike prior work that prioritizes sensor-specific realism, SynFlow employs a motion-oriented strategy to synthesize diverse kinematic patterns across 4,000 sequences ($\sim$940k frames), termed SynFlow-4k. This represents a 34x scale-up in annotated volume over existing real-world benchmarks. Our experiments demonstrate that SynFlow-4k provides a highly domain-invariant motion prior. In a zero-shot regime, models trained exclusively on our synthetic data generalize across multiple real-world benchmarks, rivaling in-domain supervised baselines on nuScenes and outperforming state-of-the-art methods on TruckScenes by 31.8%. Furthermore, SynFlow-4k serves as a label-efficient foundation: fine-tuning with only 5% of real-world labels surpasses models trained from scratch on the full available budget. We open-source the pipeline and dataset to facilitate research in generalizable 3D motion estimation. More details can be found at https://kin-zhang.github.io/SynFlow.
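To make the quantities in the abstract concrete, the following is a minimal sketch of how a scene flow prediction might be scored and how a relative gain such as "31.8%" could be computed. It rests on assumptions: the abstract does not name its evaluation metric, so plain end-point error (EPE, the mean Euclidean distance between predicted and ground-truth per-point flow vectors) is used here as a common stand-in, and the gain is assumed to be a relative error reduction. The function names and toy values are hypothetical and are not taken from the SynFlow code.

```python
import math


def end_point_error(pred_flow, gt_flow):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Each flow is a list of (dx, dy, dz) tuples, one per LiDAR point.
    Assumption: plain EPE; the paper may use a different metric.
    """
    if len(pred_flow) != len(gt_flow) or not gt_flow:
        raise ValueError("flows must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred_flow, gt_flow))
    return total / len(gt_flow)


def relative_improvement(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error * 100.0


# Toy example with two points: one moving 1 m along x, one static.
gt = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pred = [(1.0, 0.0, 0.0), (0.0, 0.3, 0.4)]  # second point is off by 0.5 m

epe = end_point_error(pred, gt)          # roughly 0.25 m
gain = relative_improvement(0.25, 0.20)  # roughly 20 percent
print(f"EPE: {epe:.3f} m, relative improvement: {gain:.1f}%")
```

Under these assumptions, a lower error counts as better, so a positive relative improvement means the new model reduces the error compared with the baseline.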
