CVAug 9, 2024Code
TrajFM: A Vehicle Trajectory Foundation Model for Region and Task TransferabilityYan Lin, Tonglong Wei, Zeyu Zhou et al.
Vehicle trajectories provide valuable movement information that supports various downstream tasks and powers real-world applications. A desirable trajectory learning model should transfer between different regions and tasks without retraining, thus improving computational efficiency and effectiveness with limited training data. However, a model's ability to transfer across regions is limited by the unique spatial features and POI arrangements of each region, which are closely linked to vehicle movement patterns and difficult to generalize. Additionally, achieving task transferability is challenging due to the differing generation schemes required for various tasks. Existing efforts towards transferability primarily involve learning embedding vectors for trajectories, which perform poorly in region transfer and still require retraining of prediction modules for task transfer. To address these challenges, we propose TrajFM, a vehicle trajectory foundation model that excels in both region and task transferability. For region transferability, we introduce STRFormer as the main learnable model within TrajFM. It integrates spatial, temporal, and POI modalities of trajectories to effectively manage variations in POI arrangements across regions and includes a learnable spatio-temporal Rotary position embedding module for handling spatial features. For task transferability, we propose a trajectory masking and recovery scheme. This scheme unifies the generation processes of various tasks into the masking and recovery of modalities and sub-trajectories, allowing TrajFM to be pre-trained once and transferred to different tasks without retraining. Experiments on two real-world vehicle trajectory datasets under various settings demonstrate the effectiveness of TrajFM. Code is available at https://anonymous.4open.science/r/TrajFM-30E4.
LGJul 12, 2024Code
STD-PLM: Understanding Both Spatial and Temporal Properties of Spatial-Temporal Data with PLMYiHeng Huang, Xiaowei Mao, Shengnan Guo et al.
Spatial-temporal forecasting and imputation are important for real-world intelligent systems. Most existing methods are tailored for individual forecasting or imputation tasks but are not designed for both. Additionally, they are less effective for zero-shot and few-shot learning. While pre-trained language model (PLM) have exhibited strong pattern recognition and reasoning abilities across various tasks, including few-shot and zero-shot learning, their applications in spatial-temporal data understanding has been constrained by insufficient modeling of complex correlations such as the temporal correlations, spatial connectivity, non-pairwise and high-order spatial-temporal correlations within data. In this paper, we propose STD-PLM for understanding both spatial and temporal properties of \underline{S}patial-\underline{T}emporal \underline{D}ata with \underline{PLM}, which is capable of implementing both spatial-temporal forecasting and imputation tasks. STD-PLM understands spatial-temporal correlations via explicitly designed spatial and temporal tokenizers. Topology-aware node embeddings are designed for PLM to comprehend and exploit the topology structure of data in inductive manner. Furthermore, to mitigate the efficiency issues introduced by the PLM, we design a sandglass attention module (SGA) combined with a specific constrained loss function, which significantly improves the model's efficiency while ensuring performance. Extensive experiments demonstrate that STD-PLM exhibits competitive performance and generalization capabilities across the forecasting and imputation tasks on various datasets. Moreover, STD-PLM achieves promising results on both few-shot and zero-shot tasks. The code is made available at \href{https://github.com/Hyheng/STD-PLM}{https://github.com/Hyheng/STD-PLM}
LGJul 6, 2023
Origin-Destination Travel Time Oracle for Map-based ServicesYan Lin, Huaiyu Wan, Jilin Hu et al.
Given an origin (O), a destination (D), and a departure time (T), an Origin-Destination (OD) travel time oracle~(ODT-Oracle) returns an estimate of the time it takes to travel from O to D when departing at T. ODT-Oracles serve important purposes in map-based services. To enable the construction of such oracles, we provide a travel-time estimation (TTE) solution that leverages historical trajectories to estimate time-varying travel times for OD pairs. The problem is complicated by the fact that multiple historical trajectories with different travel times may connect an OD pair, while trajectories may vary from one another. To solve the problem, it is crucial to remove outlier trajectories when doing travel time estimation for future queries. We propose a novel, two-stage framework called Diffusion-based Origin-destination Travel Time Estimation (DOT), that solves the problem. First, DOT employs a conditioned Pixelated Trajectories (PiT) denoiser that enables building a diffusion-based PiT inference process by learning correlations between OD pairs and historical trajectories. Specifically, given an OD pair and a departure time, we aim to infer a PiT. Next, DOT encompasses a Masked Vision Transformer~(MViT) that effectively and efficiently estimates a travel time based on the inferred PiT. We report on extensive experiments on two real-world datasets that offer evidence that DOT is capable of outperforming baseline methods in terms of accuracy, scalability, and explainability.
LGAug 9, 2024Code
PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-MambaYan Lin, Yichen Liu, Zeyu Zhou et al.
Vehicle trajectories provide crucial movement information for various real-world applications. To better utilize vehicle trajectories, it is essential to develop a trajectory learning approach that can effectively and efficiently extract rich semantic information, including movement behavior and travel purposes, to support accurate downstream applications. However, creating such an approach presents two significant challenges. First, movement behavior are inherently spatio-temporally continuous, making them difficult to extract efficiently from irregular and discrete trajectory points. Second, travel purposes are related to the functionalities of areas and road segments traversed by vehicles. These functionalities are not available from the raw spatio-temporal trajectory features and are hard to extract directly from complex textual features associated with these areas and road segments. To address these challenges, we propose PTrajM, a novel method capable of efficient and semantic-rich vehicle trajectory learning. To support efficient modeling of movement behavior, we introduce Trajectory-Mamba as the learnable model of PTrajM, which effectively extracts continuous movement behavior while being more computationally efficient than existing structures. To facilitate efficient extraction of travel purposes, we propose a travel purpose-aware pre-training procedure, which enables PTrajM to discern the travel purposes of trajectories without additional computational resources during its embedding process. Extensive experiments on two real-world datasets and comparisons with several state-of-the-art trajectory learning methods demonstrate the effectiveness of PTrajM. Code is available at https://anonymous.4open.science/r/PTrajM-C973.
LGJul 17, 2024Code
UniTE: A Survey and Unified Pipeline for Pre-training Spatiotemporal Trajectory EmbeddingsYan Lin, Zeyu Zhou, Yicheng Liu et al.
Spatiotemporal trajectories are sequences of timestamped locations, which enable a variety of analyses that in turn enable important real-world applications. It is common to map trajectories to vectors, called embeddings, before subsequent analyses. Thus, the qualities of embeddings are very important. Methods for pre-training embeddings, which leverage unlabeled trajectories for training universal embeddings, have shown promising applicability across different tasks, thus attracting considerable interest. However, research progress on this topic faces two key challenges: a lack of a comprehensive overview of existing methods, resulting in several related methods not being well-recognized, and the absence of a unified pipeline, complicating the development of new methods and the analysis of methods. We present UniTE, a survey and a unified pipeline for this domain. In doing so, we present a comprehensive list of existing methods for pre-training trajectory embeddings, which includes methods that either explicitly or implicitly employ pre-training techniques. Further, we present a unified and modular pipeline with publicly available underlying code, simplifying the process of constructing and evaluating methods for pre-training trajectory embeddings. Additionally, we contribute a selection of experimental results using the proposed pipeline on real-world datasets. Implementation of the pipeline is publicly available at https://github.com/Logan-Lin/UniTE.
CVJul 29, 2022
Pre-training General Trajectory Embeddings with Maximum Multi-view Entropy CodingYan Lin, Huaiyu Wan, Shengnan Guo et al.
Spatio-temporal trajectories provide valuable information about movement and travel behavior, enabling various downstream tasks that in turn power real-world applications. Learning trajectory embeddings can improve task performance but may incur high computational costs and face limited training data availability. Pre-training learns generic embeddings by means of specially constructed pretext tasks that enable learning from unlabeled data. Existing pre-training methods face (i) difficulties in learning general embeddings due to biases towards certain downstream tasks incurred by the pretext tasks, (ii) limitations in capturing both travel semantics and spatio-temporal correlations, and (iii) the complexity of long, irregularly sampled trajectories. To tackle these challenges, we propose Maximum Multi-view Trajectory Entropy Coding (MMTEC) for learning general and comprehensive trajectory embeddings. We introduce a pretext task that reduces biases in pre-trained trajectory embeddings, yielding embeddings that are useful for a wide variety of downstream tasks. We also propose an attention-based discrete encoder and a NeuralCDE-based continuous encoder that extract and represent travel behavior and continuous spatio-temporal correlations from trajectories in embeddings, respectively. Extensive experiments on two real-world datasets and three downstream tasks offer insight into the design properties of our proposal and indicate that it is capable of outperforming existing trajectory embedding methods.
87.1CLApr 21Code
SAMoRA: Semantic-Aware Mixture of LoRA Experts for Task-Adaptive LearningBoyan Shi, Wei Chen, Shuyuan Zhao et al.
The combination of Mixture-of-Experts (MoE) and Low-Rank Adaptation (LoRA) has shown significant potential for enhancing the multi-task learning capabilities of Large Language Models. However, existing methods face two primary challenges: (1)Imprecise Routing in the current MoE-LoRA method fails to explicitly match input semantics with expert capabilities, leading to weak expert specialization. (2)Uniform weight fusion strategies struggle to provide adaptive update strengths, overlooking the varying complexity of different tasks. To address these limitations, we propose SAMoRA (Semantic-Aware Mixture of LoRA Experts), a novel parameter-efficient fine-tuning framework tailored for task-adaptive learning. Specifically, A Semantic-Aware Router is proposed to explicitly align textual semantics with the most suitable experts for precise routing. A Task-Adaptive Scaling mechanism is designed to regulate expert contributions based on specific task requirements dynamically. In addition, a novel regularization objective is proposed to jointly promote expert specialization and effective scaling. Extensive experiments on multiple multi-task benchmarks demonstrate that SAMoRA significantly outperforms the state-of-the-art methods and holds excellent task generalization capabilities. Code is available at https://github.com/boyan-code/SAMoRA
62.5IRApr 21Code
STK-Adapter: Incorporating Evolving Graph and Event Chain for Temporal Knowledge Graph ExtrapolationShuyuan Zhao, Wei Chen, Weijie Zhang et al.
Temporal Knowledge Graph (TKG) extrapolation aims to predict future events based on historical facts. Recent studies have attempted to enhance TKG extrapolation by integrating TKG's evolving structural representations and textual event chains into Large Language Models (LLMs). Yet, two main challenges limit these approaches: (1) The loss of essential spatial-temporal information due to shallow alignment between TKG's graph evolving structural representation and the LLM's semantic space, and (2) the progressive dilution of the TKG's evolving structural features during LLM fine-tuning. To address these challenges, we propose the Spatial-Temporal Knowledge Adapter (STK-Adapter), which flexibly integrates the evolving graph encoder and the LLM to facilitate TKG reasoning. In STK-Adapter, a Spatial-Temporal MoE is designed to capture spatial structures and temporal patterns inherent in TKGs. An Event-Aware MoE is employed to model intricate temporal semantics dependencies within event chains. In addition, a Cross-Modality Alignment MoE is proposed to facilitate deep cross-modality alignment by TKG-guided attention experts. Extensive experiments on benchmark datasets demonstrate that STK-Adapter significantly outperforms state-of-the-art methods and exhibits strong generalization capabilities in cross-dataset task. The code is available at https://github.com/Zhaoshuyuan0246/STK-Adapter.
AIAug 23, 2024
DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time EstimationXiaowei Mao, Yan Lin, Shengnan Guo et al.
Uncertainty quantification in travel time estimation (TTE) aims to estimate the confidence interval for travel time, given the origin (O), destination (D), and departure time (T). Accurately quantifying this uncertainty requires generating the most likely path and assessing travel time uncertainty along the path. This involves two main challenges: 1) Predicting a path that aligns with the ground truth, and 2) modeling the impact of travel time in each segment on overall uncertainty under varying conditions. We propose DutyTTE to address these challenges. For the first challenge, we introduce a deep reinforcement learning method to improve alignment between the predicted path and the ground truth, providing more accurate travel time information from road segments to improve TTE. For the second challenge, we propose a mixture of experts guided uncertainty quantification mechanism to better capture travel time uncertainty for each segment under varying contexts. Additionally, we calibrate our results using Hoeffding's upper-confidence bound to provide statistical guarantees for the estimated confidence intervals. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed method.
LGJul 22, 2024
Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation LearningLetian Gong, Huaiyu Wan, Shengnan Guo et al.
The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention. Specifically, the temporal uncertainty and spatial diversity exhibited in check-in data make it difficult to capture the macroscopic spatial-temporal patterns of users and to understand the semantics of user mobility activities. Furthermore, the distinct characteristics of the temporal and spatial information in check-in sequences call for an effective fusion method to incorporate these two types of information. In this paper, we propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning. Specifically, STCCR addresses the above challenges by employing self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level. Besides, STCCR leverages contrastive clustering to uncover users' shared spatial topics from diverse mobility activities, while employing angular momentum contrast to mitigate the impact of temporal uncertainty and noise. We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.
LGJan 8
Spatial-Temporal Feedback Diffusion Guidance for Controlled Traffic ImputationXiaowei Mao, Huihu Ding, Yan Lin et al.
Imputing missing values in spatial-temporal traffic data is essential for intelligent transportation systems. Among advanced imputation methods, score-based diffusion models have demonstrated competitive performance. These models generate data by reversing a noising process, using observed values as conditional guidance. However, existing diffusion models typically apply a uniform guidance scale across both spatial and temporal dimensions, which is inadequate for nodes with high missing data rates. Sparse observations provide insufficient conditional guidance, causing the generative process to drift toward the learned prior distribution rather than closely following the conditional observations, resulting in suboptimal imputation performance. To address this, we propose FENCE, a spatial-temporal feedback diffusion guidance method designed to adaptively control guidance scales during imputation. First, FENCE introduces a dynamic feedback mechanism that adjusts the guidance scale based on the posterior likelihood approximations. The guidance scale is increased when generated values diverge from observations and reduced when alignment improves, preventing overcorrection. Second, because alignment to observations varies across nodes and denoising steps, a global guidance scale for all nodes is suboptimal. FENCE computes guidance scales at the cluster level by grouping nodes based on their attention scores, leveraging spatial-temporal correlations to provide more accurate guidance. Experimental results on real-world traffic datasets show that FENCE significantly enhances imputation accuracy.
LGDec 25, 2025
RIPCN: A Road Impedance Principal Component Network for Probabilistic Traffic Flow ForecastingHaochen Lv, Yan Lin, Shengnan Guo et al.
Accurate traffic flow forecasting is crucial for intelligent transportation services such as navigation and ride-hailing. In such applications, uncertainty estimation in forecasting is important because it helps evaluate traffic risk levels, assess forecast reliability, and provide timely warnings. As a result, probabilistic traffic flow forecasting (PTFF) has gained significant attention, as it produces both point forecasts and uncertainty estimates. However, existing PTFF approaches still face two key challenges: (1) how to uncover and model the causes of traffic flow uncertainty for reliable forecasting, and (2) how to capture the spatiotemporal correlations of uncertainty for accurate prediction. To address these challenges, we propose RIPCN, a Road Impedance Principal Component Network that integrates domain-specific transportation theory with spatiotemporal principal component learning for PTFF. RIPCN introduces a dynamic impedance evolution network that captures directional traffic transfer patterns driven by road congestion level and flow variability, revealing the direct causes of uncertainty and enhancing both reliability and interpretability. In addition, a principal component network is designed to forecast the dominant eigenvectors of future flow covariance, enabling the model to capture spatiotemporal uncertainty correlations. This design allows for accurate and efficient uncertainty estimation while also improving point prediction performance. Experimental results on real-world datasets show that our approach outperforms existing probabilistic forecasting methods.
SYOct 30, 2024Code
DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing DataHanyang Chen, Yang Jiang, Shengnan Guo et al.
The application of reinforcement learning in traffic signal control (TSC) has been extensively researched and yielded notable achievements. However, most existing works for TSC assume that traffic data from all surrounding intersections is fully and continuously available through sensors. In real-world applications, this assumption often fails due to sensor malfunctions or data loss, making TSC with missing data a critical challenge. To meet the needs of practical applications, we introduce DiffLight, a novel conditional diffusion model for TSC under data-missing scenarios in the offline setting. Specifically, we integrate two essential sub-tasks, i.e., traffic data imputation and decision-making, by leveraging a Partial Rewards Conditioned Diffusion (PRCD) model to prevent missing rewards from interfering with the learning process. Meanwhile, to effectively capture the spatial-temporal dependencies among intersections, we design a Spatial-Temporal transFormer (STFormer) architecture. In addition, we propose a Diffusion Communication Mechanism (DCM) to promote better communication and control performance under data-missing scenarios. Extensive experiments on five datasets with various data-missing scenarios demonstrate that DiffLight is an effective controller to address TSC with missing data. The code of DiffLight is released at https://github.com/lokol5579/DiffLight-release.
AIApr 5, 2025Code
Towards An Efficient and Effective En Route Travel Time Estimation FrameworkZekai Shen, Haitao Yuan, Xiaowei Mao et al.
En route travel time estimation (ER-TTE) focuses on predicting the travel time of the remaining route. Existing ER-TTE methods always make re-estimation which significantly hinders real-time performance, especially when faced with the computational demands of simultaneous user requests. This results in delays and reduced responsiveness in ER-TTE services. We propose a general efficient framework U-ERTTE combining an Uncertainty-Guided Decision mechanism (UGD) and Fine-Tuning with Meta-Learning (FTML) to address these challenges. UGD quantifies the uncertainty and provides confidence intervals for the entire route. It selectively re-estimates only when the actual travel time deviates from the predicted confidence intervals, thereby optimizing the efficiency of ER-TTE. To ensure the accuracy of confidence intervals and accurate predictions that need to re-estimate, FTML is employed to train the model, enabling it to learn general driving patterns and specific features to adapt to specific tasks. Extensive experiments on two large-scale real datasets demonstrate that the U-ERTTE framework significantly enhances inference speed and throughput while maintaining high effectiveness. Our code is available at https://github.com/shenzekai/U-ERTTE
LGFeb 12, 2024
Diff-RNTraj: A Structure-aware Diffusion Model for Road Network-constrained Trajectory GenerationTonglong Wei, Youfang Lin, Shengnan Guo et al.
Trajectory data is essential for various applications as it records the movement of vehicles. However, publicly available trajectory datasets remain limited in scale due to privacy concerns, which hinders the development of trajectory data mining and trajectory-based applications. To address this issue, some methods for generating synthetic trajectories have been proposed to expand the scale of the dataset. However, all existing methods generate trajectories in the geographical coordinate system, which poses two limitations for their utilization in practical applications: 1) the inability to ensure that the generated trajectories are constrained on the road. 2) the lack of road-related information. In this paper, we propose a new problem to meet the practical application need, \emph{i.e.}, road network-constrained trajectory (RNTraj) generation, which can directly generate trajectories on the road network with road-related information. RNTraj is a hybrid type of data, in which each point is represented by a discrete road segment and a continuous moving rate. To generate RNTraj, we design a diffusion model called Diff-RNTraj. This model can effectively handle the hybrid RNTraj using a continuous diffusion framework by incorporating a pre-training strategy to embed hybrid RNTraj into continuous representations. During the sampling stage, a RNTraj decoder is designed to map the continuous representation generated by the diffusion model back to the hybrid RNTraj format. Furthermore, Diff-RNTraj introduces a novel loss function to enhance the spatial validity of the generated trajectories. Extensive experiments conducted on two real-world trajectory datasets demonstrate the effectiveness of the proposed model.
LGOct 29, 2024
Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language ModelsLetian Gong, Yan Lin, Xinyue Zhang et al.
Location-based services (LBS) have accumulated extensive human mobility data on diverse behaviors through check-in sequences. These sequences offer valuable insights into users' intentions and preferences. Yet, existing models analyzing check-in sequences fail to consider the semantics contained in these sequences, which closely reflect human visiting intentions and travel preferences, leading to an incomplete comprehension. Drawing inspiration from the exceptional semantic understanding and contextual information processing capabilities of large language models (LLMs) across various domains, we present Mobility-LLM, a novel framework that leverages LLMs to analyze check-in sequences for multiple tasks. Since LLMs cannot directly interpret check-ins, we reprogram these sequences to help LLMs comprehensively understand the semantics of human visiting intentions and travel preferences. Specifically, we introduce a visiting intention memory network (VIMN) to capture the visiting intentions at each record, along with a shared pool of human travel preference prompts (HTPP) to guide the LLM in understanding users' travel preferences. These components enhance the model's ability to extract and leverage semantic information from human mobility data effectively. Extensive experiments on four benchmark datasets and three downstream tasks demonstrate that our approach significantly outperforms existing models, underscoring the effectiveness of Mobility-LLM in advancing our understanding of human mobility data within LBS contexts.
LGApr 29, 2024
Micro-Macro Spatial-Temporal Graph-based Encoder-Decoder for Map-Constrained Trajectory RecoveryTonglong Wei, Youfang Lin, Yan Lin et al.
Recovering intermediate missing GPS points in a sparse trajectory, while adhering to the constraints of the road network, could offer deep insights into users' moving behaviors in intelligent transportation systems. Although recent studies have demonstrated the advantages of achieving map-constrained trajectory recovery via an end-to-end manner, they still face two significant challenges. Firstly, existing methods are mostly sequence-based models. It is extremely hard for them to comprehensively capture the micro-semantics of individual trajectory, including the information of each GPS point and the movement between two GPS points. Secondly, existing approaches ignore the impact of the macro-semantics, i.e., the road conditions and the people's shared travel preferences reflected by a group of trajectories. To address the above challenges, we propose a Micro-Macro Spatial-Temporal Graph-based Encoder-Decoder (MM-STGED). Specifically, we model each trajectory as a graph to efficiently describe the micro-semantics of trajectory and design a novel message-passing mechanism to learn trajectory representations. Additionally, we extract the macro-semantics of trajectories and further incorporate them into a well-designed graph-based decoder to guide trajectory recovery. Extensive experiments conducted on sparse trajectories with three different sampling intervals that are respectively constructed from two real-world trajectory datasets demonstrate the superiority of our proposed model.
64.6CVApr 26
Zoom In, Reason Out: Efficient Far-field Anomaly Detection in Expressway Surveillance Videos via Focused VLM Reasoning Guided by Bayesian InferenceXiaowei Mao, Bowen Sui, Weijie Zhang et al.
Expressway video anomaly detection is essential for safety management. However, identifying anomalies across diverse scenes remains challenging, particularly for far-field targets exhibiting subtle abnormal vehicle motions. While Vision-Language Models (VLMs) demonstrate strong semantic reasoning capabilities, processing global frames causes attention dilution for these far-field objects and incurs prohibitive computational costs. To address these issues, we propose VIBES, an asynchronous collaborative framework utilizing VLMs guided by Bayesian inference. Specifically, to overcome poor generalization across varying expressway environments, we introduce an online Bayesian inference module. This module continuously evaluates vehicle trajectories to dynamically update the probabilistic boundaries of normal driving behaviors, serving as an asynchronous trigger to precisely localize anomalies in space and time. Instead of processing the continuous video stream, the VLM processes only the localized visual regions indicated by the trigger. This targeted visual input prevents attention dilution and enables accurate semantic reasoning. Extensive evaluations demonstrate that VIBES improves detection accuracy for far-field anomalies and reduces computational overhead, achieving high real-time efficiency and explainability while demonstrating generalization across diverse expressway conditions.
LGMay 21, 2024
TrajCogn: Leveraging LLMs for Cognizing Movement Patterns and Travel Purposes from TrajectoriesZeyu Zhou, Yan Lin, Haomin Wen et al.
Spatio-temporal trajectories are crucial in various data mining tasks. It is important to develop a versatile trajectory learning method that performs different tasks with high accuracy. This involves effectively extracting two core aspects of information--movement patterns and travel purposes--from trajectories. However, this is challenging due to limitations in model capacity and the quality and scale of trajectory datasets. Meanwhile, large language models (LLMs) have shown great success in versatility by training on large-scale, high-quality datasets. Given the similarities between trajectories and sentences, there's potential to leverage LLMs to develop an effective trajectory learning method. However, standard LLMs are not designed to handle the unique spatio-temporal features of trajectories and cannot extract movement patterns and travel purposes. To address these challenges, we propose a model called TrajCogn that effectively utilizes LLMs to model trajectories. TrajCogn leverages the strengths of LLMs to create a versatile trajectory learning approach while addressing the limitations of standard LLMs. First, TrajCogn incorporates a novel trajectory semantic embedder that enables LLMs to process spatio-temporal features and extract movement patterns and travel purposes. Second, TrajCogn introduces a new trajectory prompt that integrates these patterns and purposes into LLMs, allowing the model to adapt to various tasks. Extensive experiments on two real-world datasets and two representative tasks demonstrate that TrajCogn successfully achieves its design goals. Codes are available at https://anonymous.4open.science/r/TrajCogn-5021.
35.3LGMar 31
DiSGMM: A Method for Time-varying Microscopic Weight Completion on Road NetworksYan Lin, Jilin Hu, Shengnan Guo et al.
Microscopic road-network weights represent fine-grained, time-varying traffic conditions obtained from individual vehicles. An example is travel speeds associated with road segments as vehicles traverse them. These weights support tasks including traffic microsimulation and vehicle routing with reliability guarantees. We study the problem of time-varying microscopic weight completion. During a time slot, the available weights typically cover only some road segments. Weight completion recovers distributions for the weights of every road segment at the current time slot. This problem involves two challenges: (i) contending with two layers of sparsity, where weights are missing at both the network layer (many road segments lack weights) and the segment layer (a segment may have insufficient weights to enable accurate distribution estimation); and (ii) achieving a weight distribution representation that is closed-form and can capture complex conditions flexibly, including heavy tails and multiple clusters. To address these challenges, we propose DiSGMM that combines sparsity-aware embeddings with spatiotemporal modeling to leverage sparse known weights alongside learned segment properties and long-range correlations for distribution estimation. DiSGMM represents distributions of microscopic weights as learnable Gaussian mixture models, providing closed-form distributions capable of capturing complex conditions flexibly. Experiments on two real-world datasets show that DiSGMM can outperform state-of-the-art methods.
LGOct 20, 2025
TrajMamba: An Efficient and Semantic-rich Vehicle Trajectory Pre-training ModelYichen Liu, Yan Lin, Shengnan Guo et al.
Vehicle GPS trajectories record how vehicles move over time, storing valuable travel semantics, including movement patterns and travel purposes. Learning travel semantics effectively and efficiently is crucial for real-world applications of trajectory data, which is hindered by two major challenges. First, travel purposes are tied to the functions of the roads and points-of-interest (POIs) involved in a trip. Such information is encoded in textual addresses and descriptions and introduces heavy computational burden to modeling. Second, real-world trajectories often contain redundant points, which harm both computational efficiency and trajectory embedding quality. To address these challenges, we propose TrajMamba, a novel approach for efficient and semantically rich vehicle trajectory learning. TrajMamba introduces a Traj-Mamba Encoder that captures movement patterns by jointly modeling both GPS and road perspectives of trajectories, enabling robust representations of continuous travel behaviors. It also incorporates a Travel Purpose-aware Pre-training procedure to integrate travel purposes into the learned embeddings without introducing extra overhead to embedding calculation. To reduce redundancy in trajectories, TrajMamba features a Knowledge Distillation Pre-training scheme to identify key trajectory points through a learnable mask generator and obtain effective compressed trajectory embeddings. Extensive experiments on two real-world datasets and three downstream tasks show that TrajMamba outperforms state-of-the-art baselines in both efficiency and accuracy.
LGMay 19, 2025
TransferTraj: A Vehicle Trajectory Learning Model for Region and Task TransferabilityTonglong Wei, Yan Lin, Zeyu Zhou et al.
Vehicle GPS trajectories provide valuable movement information that supports various downstream tasks and applications. A desirable trajectory learning model should be able to transfer across regions and tasks without retraining, avoiding the need to maintain multiple specialized models and subpar performance with limited training data. However, each region has its unique spatial features and contexts, which are reflected in vehicle movement patterns and difficult to generalize. Additionally, transferring across different tasks faces technical challenges due to the varying input-output structures required for each task. Existing efforts towards transferability primarily involve learning embedding vectors for trajectories, which perform poorly in region transfer and require retraining of prediction modules for task transfer. To address these challenges, we propose TransferTraj, a vehicle GPS trajectory learning model that excels in both region and task transferability. For region transferability, we introduce RTTE as the main learnable module within TransferTraj. It integrates spatial, temporal, POI, and road network modalities of trajectories to effectively manage variations in spatial context distribution across regions. It also introduces a TRIE module for incorporating relative information of spatial features and a spatial context MoE module for handling movement patterns in diverse contexts. For task transferability, we propose a task-transferable input-output scheme that unifies the input-output structure of different tasks into the masking and recovery of modalities and trajectory points. This approach allows TransferTraj to be pre-trained once and transferred to different tasks without retraining. Extensive experiments on three real-world vehicle trajectory datasets under task transfer, zero-shot, and few-shot region transfer, validating TransferTraj's effectiveness.
LGDec 6, 2024
A Survey and Benchmarking of Spatial-Temporal Traffic Data Imputation ModelsShengnan Guo, Tonglong Wei, Yiheng Huang et al.
Traffic data imputation is a critical preprocessing step in intelligent transportation systems, underpinning the reliability of downstream transportation services. Despite substantial progress in imputation models, model selection and development for practical applications remains challenging due to three key gaps: 1) the absence of a model taxonomy for traffic data imputation to trace the technological development and highlight their distinct features. 2) the lack of unified benchmarking pipeline for fair and reproducible model evaluation across standardized traffic datasets. 3) insufficient in-depth analysis that jointly compare models across multiple dimensions, including effectiveness, computational efficiency and robustness. To this end, this paper proposes practice-oriented taxonomies for traffic data missing patterns and imputation models, systematically cataloging real-world traffic data loss scenarios and analyzing the characteristics of existing models. We further introduce a unified benchmarking pipeline to comprehensively evaluate 11 representative models across various missing patterns and rates, assessing overall performance, performance under challenging scenarios, computational efficiency, and providing visualizations. This work aims to provide a holistic perspective on traffic data imputation and to serve as a practical guideline for model selection and application in intelligent transportation systems.
LGOct 18, 2024
PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language ModelsTonglong Wei, Yan Lin, Youfang Lin et al.
Spatiotemporal trajectory data is crucial for various applications. However, issues such as device malfunctions and network instability often cause sparse trajectories, leading to lost detailed movement information. Recovering the missing points in sparse trajectories to restore the detailed information is thus essential. Despite recent progress, several challenges remain. First, the lack of large-scale dense trajectory data makes it difficult to train a trajectory recovery model from scratch. Second, the varying spatiotemporal correlations in sparse trajectories make it hard to generalize recovery across different sampling intervals. Third, the lack of location information complicates the extraction of road conditions for missing points. To address these challenges, we propose a novel trajectory recovery model called PLMTrajRec. It leverages the scalability of a pre-trained language model (PLM) and can be fine-tuned with only a limited set of dense trajectories. To handle different sampling intervals in sparse trajectories, we first convert each trajectory's sampling interval and movement features into natural language representations, allowing the PLM to recognize its interval. We then introduce a trajectory encoder to unify trajectories of varying intervals into a single interval and capture their spatiotemporal relationships. To obtain road conditions for missing points, we propose an area flow-guided implicit trajectory prompt, which models road conditions by collecting traffic flows in each region. We also introduce a road condition passing mechanism that uses observed points' road conditions to infer those of the missing points. Experiments on two public trajectory datasets with three sampling intervals each demonstrate the effectiveness, scalability, and generalization ability of PLMTrajRec.
LGFeb 11, 2024
UVTM: Universal Vehicle Trajectory Modeling with ST Feature Domain GenerationYan Lin, Jilin Hu, Shengnan Guo et al.
Vehicle movement is frequently captured in the form of GPS trajectories, i.e., sequences of timestamped GPS locations. Such data is widely used for various tasks such as travel-time estimation, trajectory recovery, and trajectory prediction. A universal vehicle trajectory model could be applied to different tasks, removing the need to maintain multiple specialized models, thereby reducing computational and storage costs. However, creating such a model is challenging when the integrity of trajectory features is compromised, i.e., in scenarios where only partial features are available or the trajectories are sparse. To address these challenges, we propose the Universal Vehicle Trajectory Model (UVTM), which can effectively adapt to different tasks without excessive retraining. UVTM incorporates two specialized designs. First, it divides trajectory features into three distinct domains. Each domain can be masked and generated independently to accommodate tasks with only partially available features. Second, UVTM is pre-trained by reconstructing dense, feature-complete trajectories from sparse, feature-incomplete counterparts, enabling strong performance even when the integrity of trajectory features is compromised. Experiments involving four representative trajectory-related tasks on three real-world vehicle trajectory datasets provide insight into the performance of UVTM and offer evidence that it is capable of meeting its objectives.
AISep 3, 2023
A Survey on Service Route and Time Prediction in Instant Delivery: Taxonomy, Progress, and ProspectsHaomin Wen, Youfang Lin, Lixia Wu et al.
Instant delivery services, such as food delivery and package delivery, have achieved explosive growth in recent years by providing customers with daily-life convenience. An emerging research area within these services is service Route\&Time Prediction (RTP), which aims to estimate the future service route as well as the arrival time of a given worker. As one of the most crucial tasks in those service platforms, RTP stands central to enhancing user satisfaction and trimming operational expenditures on these platforms. Despite a plethora of algorithms developed to date, there is no systematic, comprehensive survey to guide researchers in this domain. To fill this gap, our work presents the first comprehensive survey that methodically categorizes recent advances in service route and time prediction. We start by defining the RTP challenge and then delve into the metrics that are often employed. Following that, we scrutinize the existing RTP methodologies, presenting a novel taxonomy of them. We categorize these methods based on three criteria: (i) type of task, subdivided into only-route prediction, only-time prediction, and joint route\&time prediction; (ii) model architecture, which encompasses sequence-based and graph-based models; and (iii) learning paradigm, including Supervised Learning (SL) and Deep Reinforcement Learning (DRL). Conclusively, we highlight the limitations of current research and suggest prospective avenues. We believe that the taxonomy, progress, and prospects introduced in this paper can significantly promote the development of this field.