Learning a Dynamic Map of Visual Appearance
This work addresses the problem of modeling how the visual appearance of the world changes across location and time, for computer vision and mapping applications, by leveraging existing images that carry time and location metadata.
The paper proposes a global-scale, dynamic map of visual appearance attributes built from geotagged, timestamped images combined with overhead imagery. The map gives a fine-grained estimate of the expected appearance at any location and time, supporting applications such as image-driven mapping, image geolocalization, and metadata verification.
The appearance of the world varies dramatically not only from place to place but also from hour to hour and month to month. Every day, billions of images, many of them associated with precise time and location metadata, capture this complex relationship. We propose to use these images to construct a global-scale, dynamic map of visual appearance attributes. Such a map enables fine-grained understanding of the expected appearance at any geographic location and time. Our approach integrates dense overhead imagery with location and time metadata into a general framework capable of mapping a wide variety of visual attributes. A key feature of our approach is that it requires no manual data annotation. We demonstrate how this approach can support various applications, including image-driven mapping, image geolocalization, and metadata verification.