CV

Jan 31, 2022

Reducing the Amount of Real World Data for Object Detector Training with Synthetic Data

Sven Burdorf, Karoline Plum, Daniel Hasenklever

arXiv:2202.00632v1

3.76 citations

Originality: Incremental advance

AI Analysis

The paper addresses data scarcity for computer vision practitioners by making training with synthetic data more efficient, though it is an incremental advance that builds on existing mixed-dataset approaches.

The study tackled the problem of reducing real-world data requirements for training object detectors by using synthetic data, finding that up to 70% of real-world data can be saved without sacrificing detection performance, with optimal real-world data ratios between 5% and 20%.

A number of studies have investigated the training of neural networks with synthetic data for applications in the real world. The aim of this study is to quantify how much real world data can be saved when using a mixed dataset of synthetic and real world data. By modeling the relationship between the number of training examples and detection performance by a simple power law, we find that the need for real world data can be reduced by up to 70% without sacrificing detection performance. The training of object detection networks is especially enhanced by enriching the mixed dataset with classes underrepresented in the real world dataset. The results indicate that mixed datasets with real world data ratios between 5% and 20% reduce the need for real world data the most without reducing the detection performance.
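The power-law modeling mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form, so the code assumes the detection error (1 minus the performance metric) follows error = a * n^(-b) in the number of training examples n. The function names `fit_power_law`, `examples_needed`, and `real_data_saving` are hypothetical, and the numbers are made up for illustration; none of this is taken from the paper.

```python
import math


def fit_power_law(ns, errors):
    """Fit error = a * n**(-b) by ordinary least squares in log-log space.

    Assumed model (not stated in the abstract): log(error) = log(a) - b * log(n).
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errors]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def examples_needed(a, b, target_error):
    """Invert error = a * n**(-b) to get the n that reaches target_error."""
    return (a / target_error) ** (1.0 / b)


def real_data_saving(n_real_only, n_real_in_mix):
    """Fraction of real-world examples saved by the mixed dataset."""
    return 1.0 - n_real_in_mix / n_real_only


# Synthetic illustration only: points generated from a known power law.
ns = [1000, 2000, 4000, 8000]
errors = [2.0 * n ** -0.3 for n in ns]
a, b = fit_power_law(ns, errors)

# Real-only examples needed to match a hypothetical target error.
n_real_only = examples_needed(a, b, target_error=2.0 * 5000 ** -0.3)
```

With a fitted curve for real-only training, the saving is obtained by comparing `n_real_only` against the number of real-world examples a mixed dataset needs to reach the same error, for example `real_data_saving(10000, 3000)` for a hypothetical 70% saving.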
