CV LG ML Jun 3, 2019

Y-GAN: A Generative Adversarial Network for Depthmap Estimation from Multi-camera Stereo Images

arXiv:1906.00932v10.9

Originality: Incremental advance

AI Analysis

This work addresses depth estimation for autonomous robotics, but it appears incremental as it builds on existing GAN and multi-camera approaches.

The paper tackles the problem of depth perception for autonomous systems by proposing Y-GAN, a generative adversarial network that estimates depth maps from multi-camera stereo images, addressing issues like hardware cost and lack of ground truth data.

Depth perception is a key component for autonomous systems that interact with the real world, such as delivery robots, warehouse robots, and self-driving cars. Tasks in autonomous robotics such as 3D object recognition, simultaneous localization and mapping (SLAM), and path planning and navigation require some form of 3D spatial information. Depth perception is a long-standing research problem in computer vision and robotics. Many approaches using deep learning, ranging from structure from motion and shape-from-X to monocular, binocular, and multi-view stereo, have yielded acceptable results. However, these methods have several shortcomings: they require expensive hardware, need supervised training data, lack ground truth data for comparison, and disregard occlusion. To address these shortcomings, this work proposes a new deep convolutional generative adversarial network architecture, called Y-GAN, that uses data from three cameras to estimate a depth map for each frame in a multi-camera video stream.
