AI LG Oct 2, 2023

Pre-training Contextual Location Embeddings in Personal Trajectories via Efficient Hierarchical Location Representations

Chung Park, Taesan Kim, Junui Hong, Minsung Choi, Jaegul Choo

arXiv:2310.01252v1 6.75 citations h-index: 44

Originality Highly original

AI Analysis

This work addresses a scalability bottleneck in location-based services for real-world applications with fine-grained or extensive regions.

The paper tackles the problem of expensive location embedding pre-training due to large numbers of locations by proposing a Geo-Tokenizer that reduces vocabulary size through hierarchical grid representations, and it shows significant performance improvements in downstream tasks with fewer parameters compared to existing methods.

Pre-training the embedding of a location generated from human mobility data has become a popular method for location-based services. In practice, modeling the location embedding is too expensive because of the large number of locations to be trained in situations with fine-grained resolution or extensive target regions. Previous studies have handled fewer than ten thousand distinct locations, which is insufficient for real-world applications. To tackle this problem, we propose a Geo-Tokenizer, designed to efficiently reduce the number of locations to be trained by representing a location as a combination of several grids at different scales. In the Geo-Tokenizer, a grid at a larger scale shares a common set of grids at smaller scales, which is a key factor in reducing the size of the location vocabulary. The sequences of locations preprocessed with the Geo-Tokenizer are utilized by a causal location embedding model to capture the temporal dependencies of locations. This model dynamically calculates the embedding vector of a target location, which varies depending on its trajectory. In addition, to efficiently pre-train the location embedding model, we propose the Hierarchical Auto-regressive Location Model objective to effectively train decomposed locations in the Geo-Tokenizer. We conducted experiments on two real-world user trajectory datasets using our pre-trained location model. The experimental results show that our model significantly improves the performance of downstream tasks with fewer model parameters compared to existing location embedding methods.
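The vocabulary reduction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the grid shape, the number of levels, or the token layout, so the square grid, the `splits` parameter, and the function names `geo_tokenize` and `vocab_sizes` are all assumptions made for illustration. The idea shown is only that each level emits a cell index local to its parent cell, so every parent reuses the same small set of child tokens.

```python
from typing import List, Sequence, Tuple

BBox = Tuple[float, float, float, float]  # (lat_min, lat_max, lon_min, lon_max)


def geo_tokenize(lat: float, lon: float, bbox: BBox, splits: Sequence[int]) -> List[int]:
    """Hypothetical hierarchical tokenizer (illustrative, not the paper's code).

    splits[i] is the number of cells per side at level i. Each returned token
    is the cell index *within the parent cell*, so all parents share the same
    k*k child tokens at a given level.
    """
    lat_min, lat_max, lon_min, lon_max = bbox

    # Number of finest-level cells along one side of the bounding box.
    side = 1
    for k in splits:
        side *= k

    # Integer row/column at the finest resolution, clamped into the box.
    row = int((lat - lat_min) / (lat_max - lat_min) * side)
    col = int((lon - lon_min) / (lon_max - lon_min) * side)
    row = min(max(row, 0), side - 1)
    col = min(max(col, 0), side - 1)

    tokens = []
    for k in splits:
        side //= k  # finest cells per side inside one cell of this level
        r, row = divmod(row, side)  # local row at this level, remainder goes deeper
        c, col = divmod(col, side)
        tokens.append(r * k + c)
    return tokens


def vocab_sizes(splits: Sequence[int]) -> Tuple[int, int]:
    """Return (flat vocabulary size, hierarchical vocabulary size)."""
    flat = 1
    for k in splits:
        flat *= k * k  # one token per finest cell
    hierarchical = sum(k * k for k in splits)  # one shared token set per level
    return flat, hierarchical


if __name__ == "__main__":
    unit_box = (0.0, 1.0, 0.0, 1.0)
    print(geo_tokenize(0.5, 0.25, unit_box, [10, 10, 10]))  # [52, 5, 0]
    print(vocab_sizes([10, 10, 10]))  # (1000000, 300)
```

Under these assumptions, three levels of 10 x 10 cells address one million distinct locations with only 300 tokens, which is the kind of saving the abstract attributes to sharing smaller-scale grids across larger-scale ones. How the paper's embedding model combines the per-level tokens is not covered by this sketch.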
