Multi-View Frame Reconstruction with Conditional GAN
This paper addresses frame reconstruction in multi-camera systems, particularly for surveillance and video analysis, but the contribution appears incremental: it builds on existing cGAN methods and adds a weighted fusion technique.
The paper tackles multi-view frame reconstruction when multiple frames are missing and temporal gaps are large. It uses a conditional GAN approach in which the inputs from within-camera frames and overlapping-camera frames are combined by weighted averaging. Experiments on two datasets show results comparable to the state of the art in single-camera scenarios and promising performance in multi-camera settings.
Multi-view frame reconstruction is an important problem, particularly when multiple frames are missing and the past and future frames within the camera are far from the missing ones. Realistic, coherent frames can still be reconstructed using the corresponding frames from other overlapping cameras. We propose an adversarial approach that learns the spatio-temporal representation of the missing frame using a conditional Generative Adversarial Network (cGAN). The conditional input to each cGAN is either the preceding or following frames within the camera or the corresponding frames in other overlapping cameras, and the resulting representations are merged using a weighted average. Representations learned from frames within the camera receive more weight than those learned from other cameras when the within-camera frames are close to the missing frames; when they are far away, the representations from other cameras receive more weight. Experiments on two challenging datasets demonstrate that our framework produces results comparable to the state-of-the-art reconstruction method in a single camera and achieves promising performance in multi-camera scenarios.
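The weighted merging described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the inverse-distance rule below is an assumption, as are the names `fusion_weights` and `fuse`, the source labels, and the idea of assigning the overlapping camera a fixed effective distance. Representations are shown as flat lists of floats purely for illustration.

```python
from typing import Dict, List, Sequence


def fusion_weights(gaps: Dict[str, float], eps: float = 1e-6) -> Dict[str, float]:
    """Return normalized weights that shrink as the gap to the missing frame grows.

    ASSUMPTION: inverse-distance weighting; the paper's actual rule is not
    stated in the abstract.
    """
    inverse = {name: 1.0 / (gap + eps) for name, gap in gaps.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def fuse(representations: Dict[str, Sequence[float]],
         weights: Dict[str, float]) -> List[float]:
    """Weighted average of equally sized representations, one per source."""
    length = len(next(iter(representations.values())))
    fused = [0.0] * length
    for name, rep in representations.items():
        w = weights[name]
        for i, value in enumerate(rep):
            fused[i] += w * value
    return fused


# Hypothetical gaps in frames; "other_cam" uses an assumed effective distance.
near = fusion_weights({"prev": 1, "next": 1, "other_cam": 5})
far = fusion_weights({"prev": 20, "next": 20, "other_cam": 5})

# Within-camera frames dominate when close, the other camera when they are far.
print(near["prev"] > near["other_cam"], far["other_cam"] > far["prev"])
```

With this rule, the within-camera sources dominate when the temporal gap is small and the overlapping camera dominates when the gap is large, which matches the qualitative behavior the abstract describes. Any monotonically decreasing function of the gap would serve the same illustrative purpose.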