WinSyn: A High Resolution Testbed for Synthetic Data
This work addresses synthetic data generation for computer vision, particularly for semantic segmentation. It is incremental in that it builds on existing procedural modeling techniques.
The researchers tackled the challenge of creating high-quality synthetic data for training semantic segmentation networks by introducing WinSyn, a dataset and testbed with 89,318 high-resolution window crops and 21,290 synthetic images. They found that tuning a procedural model can identify the key factors that influence its fidelity, but that current techniques struggle to replicate real-world spatial semantics.
We present WinSyn, a unique dataset and testbed for creating high-quality synthetic data with procedural modeling techniques. The dataset contains high-resolution photographs of windows, selected from locations around the world, with 89,318 individual window crops showcasing diverse geometric and material characteristics. We evaluate a procedural model by training semantic segmentation networks on both synthetic and real images and then comparing their performance on a shared test set of real images. Specifically, we measure the difference in mean Intersection over Union (mIoU) and determine the effective number of real images needed to match the performance obtained by training on synthetic data. We design a baseline procedural model as a benchmark and provide 21,290 synthetically generated images. By tuning the procedural model, we identify key factors that significantly influence the model's fidelity in replicating real-world scenarios. Importantly, we highlight the challenges that procedural modeling faces with current techniques, especially in replicating the spatial semantics of real-world scenarios. This insight is critical because procedural models have the potential to act as a bridge to hidden scene aspects such as depth, reflectivity, material properties, and lighting conditions.
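The two evaluation quantities named above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the choice to skip classes absent from both prediction and ground truth, and the log-linear interpolation along a real-data learning curve are all assumptions made for illustration, and the numbers in the usage note are invented rather than taken from WinSyn.

```python
import numpy as np


def mean_iou(y_true, y_pred, num_classes):
    """Mean Intersection over Union between two integer label maps."""
    t = np.asarray(y_true, dtype=np.int64).ravel()
    p = np.asarray(y_pred, dtype=np.int64).ravel()
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.bincount(num_classes * t + p, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    # Assumption: classes that never occur in either map are ignored.
    present = union > 0
    return float((intersection[present] / union[present]).mean())


def effective_real_images(real_counts, real_mious, synthetic_miou):
    """Estimate how many real training images reach a given mIoU.

    Assumption: the real-data learning curve (mIoU against number of real
    training images) is increasing, and is interpolated linearly in
    log(count). Targets outside the measured range are clamped to the
    smallest or largest measured count by np.interp.
    """
    log_counts = np.log(np.asarray(real_counts, dtype=float))
    mious = np.asarray(real_mious, dtype=float)
    return float(np.exp(np.interp(synthetic_miou, mious, log_counts)))
```

As a usage example with invented numbers: if networks trained on 100, 1,000, and 10,000 real images reached mIoU values of 0.3, 0.5, and 0.7 on the shared real test set, then `effective_real_images([100, 1000, 10000], [0.3, 0.5, 0.7], 0.4)` places a synthetic-trained network scoring 0.4 at roughly 316 real images. The mIoU gap is simply the difference between the two networks' `mean_iou` scores on that same test set.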