Gaussian Process Priors for View-Aware Inference
This work targets the gap between probabilistic theory and practice in vision tasks and is relevant to both researchers and practitioners. Its contribution may be incremental, since it builds on existing probabilistic methods.
The authors address the underuse of inter-frame correlations in computer vision by deriving a principled framework that couples camera pose information with deep models through a novel view kernel. They show that this prior aids pose-related tasks such as novel view synthesis and the prediction of arbitrary points in the latent space of generative models, though the abstract reports no quantitative results.
While frame-independent predictions with deep neural networks have become the prominent solution to many computer vision tasks, the potential benefits of utilizing correlations between frames have received less attention. Even though probabilistic machine learning provides the ability to encode correlation as prior knowledge for inference, there is a tangible gap between the theory and practice of applying probabilistic methods to modern vision problems. To bridge this gap, we derive a principled framework that combines information coupling between camera poses (translation and orientation) with deep models. We propose a novel view kernel that generalizes the standard periodic kernel to $\mathrm{SO}(3)$. We show how this soft prior knowledge can aid several pose-related vision tasks, such as novel view synthesis and the prediction of arbitrary points in the latent space of generative models, pointing towards a range of new applications for inter-frame reasoning.
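To make the idea of a periodic kernel on $\mathrm{SO}(3)$ concrete, the following is a minimal sketch of one kernel with that property. The functional form $\kappa(R, R') = \sigma^2 \exp\!\big(-\operatorname{tr}(I - R^\top R') / (2\ell^2)\big)$ is an assumption made for illustration and is not stated in the text above. It is chosen because, for two rotations about the same axis with angles $\theta$ and $\theta'$, $\operatorname{tr}(I - R^\top R') = 4\sin^2\!\big((\theta - \theta')/2\big)$, so the expression reduces to the standard periodic kernel $\sigma^2 \exp\!\big(-2\sin^2((\theta - \theta')/2)/\ell^2\big)$. The function names, the hyperparameters, and the product with a squared-exponential kernel on translations are likewise hypothetical.

```python
import math


def rot_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def view_kernel(R1, R2, lengthscale=1.0, variance=1.0):
    """Assumed orientation kernel: variance * exp(-tr(I - R1^T R2) / (2 l^2))."""
    # tr(R1^T R2) equals the elementwise (Frobenius) inner product of R1 and R2.
    trace = sum(R1[i][j] * R2[i][j] for i in range(3) for j in range(3))
    return variance * math.exp(-(3.0 - trace) / (2.0 * lengthscale ** 2))


def periodic_kernel(theta1, theta2, lengthscale=1.0, variance=1.0):
    """Standard periodic kernel on angles (period 2*pi)."""
    half_diff = (theta1 - theta2) / 2.0
    return variance * math.exp(-2.0 * math.sin(half_diff) ** 2 / lengthscale ** 2)


def pose_kernel(R1, t1, R2, t2, rot_lengthscale=1.0, trans_lengthscale=1.0):
    """Assumed pose kernel: orientation kernel times a squared-exponential
    kernel on the translation vectors (a product form chosen for illustration)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(t1, t2))
    trans = math.exp(-sq_dist / (2.0 * trans_lengthscale ** 2))
    return view_kernel(R1, R2, rot_lengthscale) * trans


if __name__ == "__main__":
    a, b = 0.3, 1.1
    # For rotations about a shared axis, the two kernels should agree.
    print(view_kernel(rot_z(a), rot_z(b), lengthscale=0.7))
    print(periodic_kernel(a, b, lengthscale=0.7))
    # Covariance between two nearby camera poses (rotation plus translation).
    print(pose_kernel(rot_z(a), [0.0, 0.0, 0.0], rot_z(b), [0.1, 0.0, 0.2]))
```

Under these assumptions, a Gram matrix built from `pose_kernel` over a set of camera poses could serve as the prior covariance of a Gaussian process over per-frame latent codes, which is one way the inter-frame coupling described above might be realized.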