CV · GR · Jul 31, 2020

Photorealism in Driving Simulations: Blending Generative Adversarial Image Synthesis with Rendering

arXiv:2007.15820v2 · 8 citations
AI Analysis

This work addresses the need for photorealistic imagery in driving simulations, which supports both vision-based algorithms and human driver experiments. It represents an incremental improvement over existing rendering techniques.

The paper tackles low visual fidelity in driving simulators with a hybrid generative neural graphics pipeline that blends partially rendered objects with GAN-synthesized backgrounds. Measured by semantic retention analysis and Fréchet Inception Distance (FID), the resulting images are closer in photorealism to real-world driving data than those produced by conventional methods.

Driving simulators play a large role in developing and testing new intelligent vehicle systems. The visual fidelity of the simulation is critical for building vision-based algorithms and conducting human driver experiments. Low visual fidelity breaks immersion for human-in-the-loop driving experiments. Conventional computer graphics pipelines use detailed 3D models, meshes, textures, and rendering engines to generate 2D images from 3D scenes. These processes are labor-intensive, and they do not generate photorealistic imagery. Here we introduce a hybrid generative neural graphics pipeline for improving the visual fidelity of driving simulations. Given a 3D scene, we partially render only important objects of interest, such as vehicles, and use generative adversarial processes to synthesize the background and the rest of the image. To this end, we propose a novel image formation strategy to form 2D semantic images from 3D scenery consisting of simple object models without textures. These semantic images are then converted into photorealistic RGB images with a state-of-the-art Generative Adversarial Network (GAN) trained on real-world driving scenes. This replaces repetitiveness with randomly generated but photorealistic surfaces. Finally, the partially rendered and GAN-synthesized images are blended with a blending GAN. We show that images generated with the proposed method are closer in photorealism to real-world driving datasets such as Cityscapes and KITTI than images produced by conventional approaches. This comparison is made using semantic retention analysis and Fréchet Inception Distance (FID) measurements.
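
For reference, FID compares two image sets by fitting a Gaussian to each set's Inception-v3 feature activations and taking the Fréchet distance between the two Gaussians: ||mu_a - mu_b||^2 + Tr(S_a + S_b - 2 (S_a S_b)^(1/2)). The sketch below is a minimal NumPy illustration of that formula only, not the evaluation code used in the paper. The function name `frechet_distance` and the random arrays standing in for Inception features are assumptions made for illustration; a real measurement would first extract features from a pretrained Inception network.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_a, feats_b: arrays of shape (n_samples, n_features).
    Hypothetical helper for illustration; not taken from the paper.
    """
    feats_a = np.asarray(feats_a, dtype=np.float64)
    feats_b = np.asarray(feats_b, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each feature set.
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(feats_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(feats_b, rowvar=False))

    # Tr((cov_a cov_b)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_a @ cov_b, which are real and non-negative for
    # positive semi-definite covariances. Clipping guards against tiny
    # negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


# Placeholder "features": random vectors standing in for Inception activations.
rng = np.random.default_rng(0)
real_feats = rng.normal(size=(500, 8))
synth_feats = rng.normal(loc=0.5, size=(500, 8))

print("real vs. real: ", frechet_distance(real_feats, real_feats))
print("real vs. synth:", frechet_distance(real_feats, synth_feats))
```

A lower value means the two feature distributions are closer, so comparing a set against itself gives a distance of approximately zero. Shifting every feature vector by a constant offset leaves the covariance unchanged, so the distance reduces to the squared length of that offset.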
