Tile2Vec: Unsupervised representation learning for spatially distributed data
This addresses a gap in geospatial analysis for researchers and practitioners, offering a novel approach that is incremental in adapting existing NLP concepts to a new domain.
The paper tackles the lack of unsupervised representation learning methods for geospatial data by introducing Tile2Vec, which extends the distributional hypothesis to spatial data, showing improved performance in downstream classification tasks and enabling visual analogies in the latent space.
Geospatial analysis lacks methods like the word vector representations and pre-trained networks that significantly boost performance across a wide range of natural language and computer vision tasks. To fill this gap, we introduce Tile2Vec, an unsupervised representation learning algorithm that extends the distributional hypothesis from natural language -- words appearing in similar contexts tend to have similar meanings -- to spatially distributed data. We demonstrate empirically that Tile2Vec learns semantically meaningful representations on three datasets. Our learned representations significantly improve performance in downstream classification tasks and, similar to word vectors, visual analogies can be obtained via simple arithmetic in the latent space.