CV LGMay 7, 2019

Learning to Interpret Satellite Images in Global Scale Using Wikipedia

Burak Uzkent, Evan Sheehan, Chenlin Meng, Zhongyi Tang, Marshall Burke, David Lobell, Stefano Ermon

arXiv:1905.02506v311.031 citationsh-index: 113

Originality Incremental advance

AI Analysis

This work addresses the problem of limited labeled data for satellite image analysis, benefiting researchers and practitioners in remote sensing and computer vision, though it is incremental as it builds on existing pre-training methods.

The authors tackled the challenge of fine-grained satellite image interpretation by creating WikiSatNet, a dataset pairing georeferenced Wikipedia articles with satellite imagery, and proposed strategies to learn image representations by predicting article properties, reducing the need for human-annotated labels. They achieved a performance boost of up to 4.5% in F1 score on the fMoW dataset compared to ImageNet pre-training.

Despite recent progress in computer vision, finegrained interpretation of satellite images remains challenging because of a lack of labeled training data. To overcome this limitation, we construct a novel dataset called WikiSatNet by pairing georeferenced Wikipedia articles with satellite imagery of their corresponding locations. We then propose two strategies to learn representations of satellite images by predicting properties of the corresponding articles from the images. Leveraging this new multi-modal dataset, we can drastically reduce the quantity of human-annotated labels and time required for downstream tasks. On the recently released fMoW dataset, our pre-training strategies can boost the performance of a model pre-trained on ImageNet by up to 4:5% in F1 score.

View on arXiv PDF

Similar