Semi-supervised Viewpoint Estimation with Geometry-aware Conditional Generation
This work addresses the high cost of viewpoint annotation in computer vision and offers a solution for applications with limited supervision.
The paper tackles the problem of predicting camera viewpoints with limited labeled data by proposing a semi-supervised method that learns from unlabeled image pairs, showing significant improvements over supervised techniques and outperforming state-of-the-art semi-supervised methods.
There is growing interest in developing computer vision methods that can learn from limited supervision. In this paper, we consider the problem of learning to predict camera viewpoints from a limited number of labeled images, a setting in which obtaining ground-truth annotations is expensive and requires special equipment. We propose a semi-supervised viewpoint estimation method that learns to infer viewpoint information from unlabeled image pairs, where the two images differ by a viewpoint change. In particular, our method learns to synthesize the second image by combining the appearance from the first one and the viewpoint from the second one. We demonstrate that our method significantly improves over supervised techniques, especially in the low-label regime, and outperforms state-of-the-art semi-supervised methods.
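The synthesis idea in the abstract can be illustrated with a toy sketch. This is not the paper's model: here an "image" is assumed to be a list of 2D points, a "viewpoint" is a single in-plane rotation angle, and the functions `rotate`, `synthesize`, and `reconstruction_loss` are hypothetical names introduced only for illustration. The sketch shows that rendering the first image's content at the second image's viewpoint reproduces the second image only when the viewpoints are consistent, which is the signal an unlabeled pair could provide.

```python
import math

def rotate(points, theta):
    """Rotate 2D points by angle theta (radians) about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]

def synthesize(image_a, theta_a, theta_b):
    """Render image_a's content at viewpoint theta_b (toy assumption).

    Appearance is taken from image_a by undoing its estimated viewpoint;
    the target viewpoint theta_b is then applied.
    """
    appearance = rotate(image_a, -theta_a)
    return rotate(appearance, theta_b)

def reconstruction_loss(pred, target):
    """Mean squared distance between corresponding points."""
    total = sum((px - tx) ** 2 + (py - ty) ** 2
                for (px, py), (tx, ty) in zip(pred, target))
    return total / len(target)

# An unlabeled pair: the same object seen from two viewpoints.
shape = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0)]
true_a, true_b = 0.3, 1.1
image_a, image_b = rotate(shape, true_a), rotate(shape, true_b)

# Consistent viewpoints reconstruct image_b; a wrong target viewpoint does not.
good = reconstruction_loss(synthesize(image_a, true_a, true_b), image_b)
bad = reconstruction_loss(synthesize(image_a, true_a, 0.0), image_b)
```

In this toy setting the reconstruction loss depends only on the difference between the two angles, so shifting both estimates by the same offset leaves it unchanged. That is one way to see why a small number of labeled images could still be needed alongside the unlabeled pairs.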