A Unified Model for Near and Remote Sensing
This work addresses the challenge of integrating multi-source imagery for urban analysis. It offers a domain-specific solution whose hybrid approach is an incremental advance.
The authors estimate geospatial functions such as population density and land use by combining overhead and ground-level images in a unified convolutional neural network. On a new urban dataset, the approach is more accurate on every task, in some cases dramatically so.
We propose a novel convolutional neural network architecture for estimating geospatial functions such as population density, land cover, or land use. In our approach, we combine overhead and ground-level images in an end-to-end trainable neural network, which uses kernel regression and density estimation to convert features extracted from the ground-level images into a dense feature map. The output of this network is a dense estimate of the geospatial function in the form of a pixel-level labeling of the overhead image. To evaluate our approach, we created a large dataset of overhead and ground-level images from a major urban area with three sets of labels: land use, building function, and building age. We find that our approach is more accurate for all tasks, in some cases dramatically so.
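The step that turns sparse ground-level features into a dense feature map can be illustrated with a minimal NumPy sketch of kernel regression (the Nadaraya-Watson estimator) alongside an unnormalized kernel density estimate. This is not the authors' implementation: the Gaussian kernel, the fixed `bandwidth`, the pixel-coordinate convention, and the function name `ground_to_dense` are all assumptions made for illustration, and in the network described above this step would sit inside an end-to-end trainable model rather than run as a standalone function.

```python
import numpy as np


def ground_to_dense(locations, features, grid_h, grid_w, bandwidth=2.0):
    """Spread sparse per-image features over a pixel grid (hypothetical helper).

    locations: (N, 2) array of (row, col) positions of the ground-level images,
               expressed in overhead-image pixel coordinates (an assumption).
    features:  (N, D) array, one feature vector per ground-level image.
    Returns (dense, density):
        dense   -- (grid_h, grid_w, D) kernel-regression estimate per pixel
        density -- (grid_h, grid_w) unnormalized kernel density of image locations
    """
    locations = np.asarray(locations, dtype=float)
    features = np.asarray(features, dtype=float)

    # Coordinates of every pixel centre, flattened to shape (P, 2).
    rows, cols = np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij")
    pixels = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(float)

    # Squared distance from each pixel to each ground-level image: (P, N).
    d2 = ((pixels[:, None, :] - locations[None, :, :]) ** 2).sum(axis=2)

    # Gaussian kernel weights; the fixed bandwidth is an illustrative choice.
    weights = np.exp(-d2 / (2.0 * bandwidth ** 2))

    # Density estimate: how much ground-level evidence supports each pixel.
    density = weights.sum(axis=1)

    # Kernel regression: weighted average of the features at each pixel.
    dense = (weights @ features) / np.maximum(density, 1e-12)[:, None]

    return dense.reshape(grid_h, grid_w, -1), density.reshape(grid_h, grid_w)
```

In this sketch, a pixel midway between two ground-level images receives the average of their features, and the density map is low wherever no ground-level image is nearby, which gives later layers a signal for how far to trust the interpolated features at each pixel.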