Fully Procedural Synthetic Data from Simple Rules for Multi-View Stereo
This work addresses data scarcity and curation cost for computer vision researchers and practitioners by offering an efficient alternative to manual data collection, although its contribution is an incremental improvement to synthetic data generation methods.
The paper tackles the generation of synthetic training data for multi-view stereo (MVS) by introducing SimpleProc, a fully procedural generator driven by a small set of rules built on NURBS together with basic displacement and texture patterns. At 8,000 images it outperforms manually curated data of the same scale, and at 352,000 images it matches or, on several benchmarks, exceeds models trained on over 692,000 curated images.
In this paper, we explore the design space of procedural rules for multi-view stereo (MVS). We demonstrate that we can generate effective training data using SimpleProc: a new, fully procedural generator driven by a very small set of rules using Non-Uniform Rational Basis Splines (NURBS), as well as basic displacement and texture patterns. At a modest scale of 8,000 images, our approach achieves superior results compared to manually curated images (at the same scale) sourced from games and real-world objects. When scaled to 352,000 images, our method yields performance comparable to, and in several benchmarks exceeding, models trained on over 692,000 manually curated images. The source code and the data are available at https://github.com/princeton-vl/SimpleProc.
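
To make the idea of a rule-driven shape generator concrete, the following is a minimal sketch of how a random NURBS surface patch could be sampled and then perturbed with a simple displacement pattern. It is an illustration only and is not taken from the SimpleProc code base: the function names (clamped_uniform_knots, bspline_basis, nurbs_surface_point, random_patch, displace), the 5x5 control grid, the weight range, and the sinusoidal displacement along z (rather than along the surface normal) are all assumptions chosen for brevity.

```python
import math
import random


def clamped_uniform_knots(n_ctrl, degree):
    """Clamped knot vector on [0, 1] with evenly spaced interior knots."""
    interior = n_ctrl - degree - 1
    inner = [(k + 1) / (interior + 1) for k in range(interior)]
    return [0.0] * (degree + 1) + inner + [1.0] * (degree + 1)


def bspline_basis(i, p, u, knots):
    """Cox-de Boor recursion: value of the i-th degree-p B-spline basis at u."""
    if p == 0:
        if knots[i] <= u < knots[i + 1]:
            return 1.0
        # Close the last non-empty span so that u == knots[-1] is covered.
        if u == knots[-1] and knots[i] < u == knots[i + 1]:
            return 1.0
        return 0.0
    left = right = 0.0
    d1 = knots[i + p] - knots[i]
    if d1 > 0:
        left = (u - knots[i]) / d1 * bspline_basis(i, p - 1, u, knots)
    d2 = knots[i + p + 1] - knots[i + 1]
    if d2 > 0:
        right = (knots[i + p + 1] - u) / d2 * bspline_basis(i + 1, p - 1, u, knots)
    return left + right


def nurbs_surface_point(ctrl, weights, u, v, degree=3):
    """Evaluate a rational (weighted) tensor-product B-spline surface at (u, v)."""
    rows, cols = len(ctrl), len(ctrl[0])
    ku = clamped_uniform_knots(rows, degree)
    kv = clamped_uniform_knots(cols, degree)
    bu = [bspline_basis(i, degree, u, ku) for i in range(rows)]
    bv = [bspline_basis(j, degree, v, kv) for j in range(cols)]
    num = [0.0, 0.0, 0.0]
    den = 0.0
    for i in range(rows):
        for j in range(cols):
            w = bu[i] * bv[j] * weights[i][j]
            den += w
            for axis in range(3):
                num[axis] += w * ctrl[i][j][axis]
    return tuple(c / den for c in num)


def random_patch(rng, n=5, jitter=0.3):
    """Hypothetical rule: a regular n x n grid with random heights and weights."""
    ctrl = [[(i / (n - 1), j / (n - 1), rng.uniform(-jitter, jitter))
             for j in range(n)] for i in range(n)]
    weights = [[rng.uniform(0.5, 2.0) for _ in range(n)] for _ in range(n)]
    return ctrl, weights


def displace(point, amplitude=0.02, frequency=12.0):
    """Hypothetical displacement rule: sinusoidal offset along z (a simplification)."""
    x, y, z = point
    return (x, y, z + amplitude * math.sin(frequency * x) * math.sin(frequency * y))


rng = random.Random(0)
ctrl, weights = random_patch(rng)
# Sample an 8 x 8 grid of surface points and apply the displacement rule.
samples = [displace(nurbs_surface_point(ctrl, weights, a / 7, b / 7))
           for a in range(8) for b in range(8)]
print(len(samples), samples[0])
```

Changing the seed yields a different patch from the same rules, which is the property a procedural generator relies on to produce many distinct training scenes. A full pipeline would additionally need texturing, camera placement, and rendering with ground-truth depth, none of which is sketched here.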