CV LG IV

Mar 24, 2020

On Localizing a Camera from a Single Image

Pradipta Ghosh, Xiaochen Liu, Hang Qiu, Marcos A. M. Vieira, Gaurav S. Sukhatme, Ramesh Govindan

arXiv:2003.10664v1

2.31 citations

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of localizing public cameras that have limited metadata, which enables precise tracking of events seen by those cameras. It appears incremental because it builds on existing methods such as neural networks and projective geometry.

The paper tackles the problem of estimating a camera's location from a single image. It localizes 95% of the images in its test data set to within 12 meters, which the authors report as two orders of magnitude better than PoseNet.

Public cameras often have limited metadata describing their attributes. A key missing attribute is the precise location of the camera, using which it is possible to precisely pinpoint the location of events seen in the camera. In this paper, we explore the following question: under what conditions is it possible to estimate the location of a camera from a single image taken by the camera? We show that, using a judicious combination of projective geometry, neural networks, and crowd-sourced annotations from human workers, it is possible to position 95% of the images in our test data set to within 12 m. This performance is two orders of magnitude better than PoseNet, a state-of-the-art neural network that, when trained on a large corpus of images in an area, can estimate the pose of a single image. Finally, we show that the camera's inferred position and intrinsic parameters can help design a number of virtual sensors, all of which are reasonably accurate.
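The headline figure in the abstract (95% of test images positioned to within 12 m) is a thresholded accuracy statistic. The following is a minimal sketch of how such a figure could be computed, assuming positions are already expressed in a local metric frame as (x, y) pairs in meters. The function names and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math

def horizontal_error_m(est, truth):
    """Euclidean distance in meters between two (x, y) positions
    given in the same local metric frame (an assumption of this sketch)."""
    return math.hypot(est[0] - truth[0], est[1] - truth[1])

def fraction_within(errors_m, threshold_m=12.0):
    """Fraction of localization errors at or below the threshold."""
    if not errors_m:
        raise ValueError("need at least one error value")
    return sum(e <= threshold_m for e in errors_m) / len(errors_m)

# Made-up estimates and ground truth for illustration only, not paper data.
estimates = [(3.0, 4.0), (0.0, 10.0), (20.0, 0.0), (1.0, 1.0)]
ground_truth = [(0.0, 0.0)] * 4

errors = [horizontal_error_m(e, t) for e, t in zip(estimates, ground_truth)]
share = fraction_within(errors)
print(f"{share:.0%} of images localized within 12 m")
```

With real data, the estimates would come from the localization pipeline and the ground truth from surveyed camera positions. Geographic coordinates would first need converting to a metric frame.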
