Cross-view image synthesis using geometry-guided conditional GANs
This work addresses a challenging cross-view image synthesis problem for computer vision applications, offering an incremental improvement over existing methods.
The paper tackles the problem of generating realistic images across ground and aerial views by preserving pixel information through geometry-guided mapping and inpainting, demonstrating that this approach adds fine details and outperforms purely pixel-based methods.
We address the problem of generating images across two drastically different views, namely ground (street) and aerial (overhead) views. Image synthesis is by itself a very challenging computer vision task, and it is even more so when generation is conditioned on an image in another view. Due to the difference in viewpoints, there is only a small overlapping field of view and little common content between these two views. Here, we try to preserve the pixel information between the views so that the generated image is a realistic representation of the cross-view input image. For this, we propose to use homography as a guide to map the images between the views based on the common field of view, which preserves the details in the input image. We then use generative adversarial networks to inpaint the missing regions in the transformed image and add realism to it. Our exhaustive evaluation and model comparison demonstrate that utilizing geometry constraints adds fine details to the generated images and can be a better approach for cross-view image synthesis than purely pixel-based synthesis methods.
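The homography-guided mapping step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `apply_homography` and `warp_image`, the nearest-neighbour sampling, and the convention that the 3x3 matrix maps output pixels back to source pixels are all assumptions made for illustration. The sketch warps a source image into the target view and returns a mask of which pixels were filled; the unfilled pixels are the missing regions that a generative model would then have to inpaint (that stage is not shown).

```python
def apply_homography(H, x, y):
    """Map point (x, y) through a 3x3 homography H given as nested lists."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        return None  # the point maps to infinity
    return u / w, v / w


def warp_image(src, H_out_to_src, out_h, out_w, fill=0):
    """Inverse-warp src into an out_h x out_w grid (hypothetical helper).

    H_out_to_src is assumed to map output pixel coordinates to source
    pixel coordinates. Returns (warped, known), where known[y][x] is
    False for pixels with no source content, i.e. holes to inpaint.
    """
    src_h, src_w = len(src), len(src[0])
    warped = [[fill] * out_w for _ in range(out_h)]
    known = [[False] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            p = apply_homography(H_out_to_src, x, y)
            if p is None:
                continue
            # Nearest-neighbour sampling (an assumption; bilinear is also common).
            sx, sy = round(p[0]), round(p[1])
            if 0 <= sx < src_w and 0 <= sy < src_h:
                warped[y][x] = src[sy][sx]
                known[y][x] = True
    return warped, known


# Toy example: a pure translation, so output pixel x reads source pixel x - 1.
src = [[1, 2],
       [3, 4]]
H = [[1, 0, -1],
     [0, 1, 0],
     [0, 0, 1]]
warped, known = warp_image(src, H, out_h=2, out_w=3)
```

In this toy case the first output column has no corresponding source pixel, so it stays at the fill value and is marked as unknown in the mask. In the setting the abstract describes, such unknown regions would be much larger because the ground and aerial views share only a small field of view.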