Semantic Trajectory Data Mining with LLM-Informed POI Classification
This work addresses the problem of incomplete POI data in transportation systems, with potential benefits for route optimization and traffic management. The contribution is incremental, since it builds on existing trajectory mining methods.
The paper tackles human travel trajectory mining by integrating semantic information from Points of Interest (POI). Its pipeline uses large language models to classify POIs by activity type and a Bayesian-based algorithm to infer the activity at each stay point, achieving 93.4% accuracy in POI classification and 91.7% accuracy in activity inference.
Human travel trajectory mining is crucial for transportation systems: it supports route optimization, traffic management, and the study of human travel patterns. Previous rule-based approaches that do not integrate semantic information are limited in both efficiency and accuracy. Semantic information, such as activity types inferred from Points of Interest (POI) data, can significantly improve the quality of trajectory mining. However, integrating it is challenging, because many POIs have incomplete feature information and current learning-based POI classification algorithms require complete datasets. In this paper, we introduce a novel pipeline for human travel trajectory mining. Our approach first leverages the strong inferential and comprehension capabilities of large language models (LLMs) to annotate POIs with activity types, and then uses a Bayesian-based algorithm to infer the activity for each stay point in a trajectory. In our evaluation on the OpenStreetMap (OSM) POI dataset, our approach achieves 93.4% accuracy and a 96.1% F1 score in POI classification, and 91.7% accuracy and a 92.3% F1 score in activity inference.
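To make the second stage of the pipeline concrete, the following is a minimal sketch of what Bayesian activity inference for a single stay point could look like. It is not the paper's algorithm, whose details are not given in the abstract. The function name `infer_activity`, the activity labels, the Gaussian distance kernel, the `bandwidth_m` parameter, and the use of planar coordinates in metres are all assumptions made for illustration. The sketch takes POIs that already carry an activity label (as the LLM annotation step would produce) and combines a prior over activities with distance-weighted evidence from nearby POIs.

```python
import math


def infer_activity(stay_point, pois, prior, bandwidth_m=100.0):
    """Return (most_likely_activity, posterior) for one stay point.

    Illustrative assumptions, not taken from the paper:
      stay_point  -- (x, y) in metres in a local projected frame
      pois        -- list of (x, y, activity) tuples, already annotated
      prior       -- dict activity -> P(activity), e.g. from time of day
      bandwidth_m -- scale of the Gaussian distance kernel
    """
    # Likelihood term: kernel-weighted evidence from POIs of each activity.
    evidence = {activity: 0.0 for activity in prior}
    for x, y, activity in pois:
        if activity not in evidence:
            continue  # ignore labels that have no prior
        d2 = (x - stay_point[0]) ** 2 + (y - stay_point[1]) ** 2
        evidence[activity] += math.exp(-d2 / (2.0 * bandwidth_m ** 2))

    # Bayes' rule: posterior is proportional to prior times likelihood.
    # The small constant keeps the posterior defined when no POI is nearby.
    unnormalized = {a: prior[a] * (evidence[a] + 1e-9) for a in prior}
    total = sum(unnormalized.values())
    posterior = {a: v / total for a, v in unnormalized.items()}

    best = max(posterior, key=posterior.get)
    return best, posterior


if __name__ == "__main__":
    # Hypothetical example: a restaurant 10 m away, a shop 300 m away.
    example_pois = [(10.0, 0.0, "dining"), (300.0, 0.0, "shopping")]
    uniform_prior = {"dining": 0.5, "shopping": 0.5}
    print(infer_activity((0.0, 0.0), example_pois, uniform_prior))
```

With a uniform prior the result is driven entirely by proximity. A non-uniform prior (for example, one conditioned on arrival time or stay duration) would shift the posterior toward activities that are more plausible in context, which is the main reason to prefer a Bayesian formulation over a nearest-POI rule.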