CV · LG · Jul 13, 2020

AI Playground: Unreal Engine-based Data Ablation Tool for Deep Learning

arXiv:2007.06153v1 · 22 citations · Has Code
AI Analysis

This tool addresses the problem of data scarcity and inflexibility for researchers and practitioners in deep learning, though it is incremental as it builds on existing synthetic data generation methods.

The paper tackles the difficulty of acquiring and altering real-world data for machine learning by introducing AI Playground (AIP), an open-source, Unreal Engine-based tool that generates virtual image data under varying conditions such as lighting and fidelity. The authors validate it by training deep neural networks on the generated data; when tested on two real-world depth-estimation datasets, these networks obtained results comparable to training on real data alone.

Machine learning requires data, but acquiring and labeling real-world data is challenging, expensive, and time-consuming. More importantly, it is nearly impossible to alter real data post-acquisition (e.g., change the illumination of a room), making it very difficult to measure how specific properties of the data affect performance. In this paper, we present AI Playground (AIP), an open-source, Unreal Engine-based tool for generating and labeling virtual image data. With AIP, it is trivial to capture the same image under different conditions (e.g., fidelity, lighting, etc.) and with different ground truths (e.g., depth or surface normal values). AIP is easily extendable and can be used with or without code. To validate our proposed tool, we generated eight datasets of otherwise identical but varying lighting and fidelity conditions. We then trained deep neural networks to predict (1) depth values, (2) surface normals, or (3) object labels and assessed each network's intra- and cross-dataset performance. Among other insights, we verified that sensitivity to different settings is problem-dependent. We confirmed the findings of other studies that segmentation models are very sensitive to fidelity, but we also found that they are just as sensitive to lighting. In contrast, depth and normal estimation models seem to be less sensitive to fidelity or lighting and more sensitive to the structure of the image. Finally, we tested our trained depth-estimation networks on two real-world datasets and obtained results comparable to training on real data alone, confirming that our virtual environments are realistic enough for real-world tasks.
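The intra- and cross-dataset assessment described in the abstract can be pictured as a train-by-test score grid. The sketch below is only an illustration of that idea, not the authors' code: `cross_dataset_scores`, `split_intra_cross`, `train_fn`, and `eval_fn` are hypothetical names, and the datasets, model, and metric are left to the caller.

```python
from typing import Callable, Dict, TypeVar

Dataset = TypeVar("Dataset")
Model = TypeVar("Model")


def cross_dataset_scores(
    train_sets: Dict[str, Dataset],
    test_sets: Dict[str, Dataset],
    train_fn: Callable[[Dataset], Model],
    eval_fn: Callable[[Model, Dataset], float],
) -> Dict[str, Dict[str, float]]:
    """Train one model per dataset and score it on every test set.

    The result is a nested dict: scores[train_name][test_name].
    """
    scores: Dict[str, Dict[str, float]] = {}
    for train_name, train_data in train_sets.items():
        model = train_fn(train_data)  # hypothetical training routine
        scores[train_name] = {
            test_name: eval_fn(model, test_data)
            for test_name, test_data in test_sets.items()
        }
    return scores


def split_intra_cross(
    scores: Dict[str, Dict[str, float]]
) -> Dict[str, Dict[str, float]]:
    """Summarize each row as its intra-dataset score (same name for
    train and test) and the mean of its cross-dataset scores."""
    summary: Dict[str, Dict[str, float]] = {}
    for train_name, row in scores.items():
        cross = [s for name, s in row.items() if name != train_name]
        summary[train_name] = {
            "intra": row[train_name],
            "cross_mean": sum(cross) / len(cross) if cross else float("nan"),
        }
    return summary
```

A large gap between a row's `intra` and `cross_mean` values would suggest that a model trained under one condition is sensitive to the property that differs in the others (for example lighting or fidelity), which is the kind of comparison the abstract reports.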

Code Implementations: 1 repo
