Deep Learning based Novel View Synthesis
This work addresses the challenge of novel view synthesis for computer vision applications, but it is incremental as it builds on existing deep learning approaches with specific improvements like handling variable input counts and multi-resolution analysis.
The authors tackled the problem of predicting novel views of a scene from real-world images by proposing a deep CNN that works with varying numbers of input images, unlike prior methods limited to fixed inputs, and achieved competitive performance on different datasets.
Predicting novel views of a scene from real-world images has always been a challenging task. In this work, we propose a deep convolutional neural network (CNN) which learns to predict novel views of a scene from given collection of images. In comparison to prior deep learning based approaches, which can handle only a fixed number of input images to predict novel view, proposed approach works with different numbers of input images. The proposed model explicitly performs feature extraction and matching from a given pair of input images and estimates, at each pixel, the probability distribution (pdf) over possible depth levels in the scene. This pdf is then used for estimating the novel view. The model estimates multiple predictions of novel view, one estimate per input image pair, from given image collection. The model also estimates an occlusion mask and combines multiple novel view estimates in to a single optimal prediction. The finite number of depth levels used in the analysis may cause occasional blurriness in the estimated view. We mitigate this issue with simple multi-resolution analysis which improves the quality of the estimates. We substantiate the performance on different datasets and show competitive performance.