LGMay 24, 2025Code
A Survey of Large Language Models for Data Challenges in GraphsMengran Li, Pengyu Zhang, Wenbin Xing et al.
Graphs are a widely used paradigm for representing non-Euclidean data, with applications ranging from social network analysis to biomolecular prediction. While graph learning has achieved remarkable progress, real-world graph data presents a number of challenges that significantly hinder the learning process. In this survey, we focus on four fundamental data-centric challenges: (1) Incompleteness, real-world graphs have missing nodes, edges, or attributes; (2) Imbalance, the distribution of the labels of nodes or edges and their structures for real-world graphs are highly skewed; (3) Cross-domain Heterogeneity, graphs from different domains exhibit incompatible feature spaces or structural patterns; and (4) Dynamic Instability, graphs evolve over time in unpredictable ways. Recently, Large Language Models (LLMs) offer the potential to tackle these challenges by leveraging rich semantic reasoning and external knowledge. This survey focuses on how LLMs can address four fundamental data-centric challenges in graph-structured data, thereby improving the effectiveness of graph learning. For each challenge, we review both traditional solutions and modern LLM-driven approaches, highlighting how LLMs contribute unique advantages. Finally, we discuss open research questions and promising future directions in this emerging interdisciplinary field. To support further exploration, we have curated a repository of recent advances on graph learning challenges: https://github.com/limengran98/Awesome-Literature-Graph-Learning-Challenges.
LGApr 8, 2025
MM-STFlowNet: A Transportation Hub-Oriented Multi-Mode Passenger Flow Prediction Method via Spatial-Temporal Dynamic Graph ModelingRonghui Zhang, Wenbin Xing, Mengran Li et al.
Accurate and refined passenger flow prediction is essential for optimizing the collaborative management of multiple collection and distribution modes in large-scale transportation hubs. Traditional methods often focus only on the overall passenger volume, neglecting the interdependence between different modes within the hub. To address this limitation, we propose MM-STFlowNet, a comprehensive multi-mode prediction framework grounded in dynamic spatial-temporal graph modeling. Initially, an integrated temporal feature processing strategy is implemented using signal decomposition and convolution techniques to address data spikes and high volatility. Subsequently, we introduce the Spatial-Temporal Dynamic Graph Convolutional Recurrent Network (STDGCRN) to capture detailed spatial-temporal dependencies across multiple traffic modes, enhanced by an adaptive channel attention mechanism. Finally, the self-attention mechanism is applied to incorporate various external factors, further enhancing prediction accuracy. Experiments on a real-world dataset from Guangzhounan Railway Station in China demonstrate that MM-STFlowNet achieves state-of-the-art performance, particularly during peak periods, providing valuable insight for transportation hub management.
LGSep 8, 2025
Group Effect Enhanced Generative Adversarial Imitation Learning for Individual Travel Behavior Modeling under IncentivesYuanyuan Wu, Zhenlin Qin, Leizhen Wang et al.
Understanding and modeling individual travel behavior responses is crucial for urban mobility regulation and policy evaluation. The Markov decision process (MDP) provides a structured framework for dynamic travel behavior modeling at the individual level. However, solving an MDP in this context is highly data-intensive and faces challenges of data quantity, spatial-temporal coverage, and situational diversity. To address these, we propose a group-effect-enhanced generative adversarial imitation learning (gcGAIL) model that improves the individual behavior modeling efficiency by leveraging shared behavioral patterns among passenger groups. We validate the gcGAIL model using a public transport fare-discount case study and compare against state-of-the-art benchmarks, including adversarial inverse reinforcement learning (AIRL), baseline GAIL, and conditional GAIL. Experimental results demonstrate that gcGAIL outperforms these methods in learning individual travel behavior responses to incentives over time in terms of accuracy, generalization, and pattern demonstration efficiency. Notably, gcGAIL is robust to spatial variation, data sparsity, and behavioral diversity, maintaining strong performance even with partial expert demonstrations and underrepresented passenger groups. The gcGAIL model predicts the individual behavior response at any time, providing the basis for personalized incentives to induce sustainable behavior changes (better timing of incentive injections).
AIApr 12, 2021
Learning dynamic and hierarchical traffic spatiotemporal features with TransformerHaoyang Yan, Xiaolei Ma
Traffic forecasting is an indispensable part of Intelligent transportation systems (ITS), and long-term network-wide accurate traffic speed forecasting is one of the most challenging tasks. Recently, deep learning methods have become popular in this domain. As traffic data are physically associated with road networks, most proposed models treat it as a spatiotemporal graph modeling problem and use Graph Convolution Network (GCN) based methods. These GCN-based models highly depend on a predefined and fixed adjacent matrix to reflect the spatial dependency. However, the predefined fixed adjacent matrix is limited in reflecting the actual dependence of traffic flow. This paper proposes a novel model, Traffic Transformer, for spatial-temporal graph modeling and long-term traffic forecasting to overcome these limitations. Transformer is the most popular framework in Natural Language Processing (NLP). And by adapting it to the spatiotemporal problem, Traffic Transformer hierarchically extracts spatiotemporal features through data dynamically by multi-head attention and masked multi-head attention mechanism, and fuse these features for traffic forecasting. Furthermore, analyzing the attention weight matrixes can find the influential part of road networks, allowing us to learn the traffic networks better. Experimental results on the public traffic network datasets and real-world traffic network datasets generated by ourselves demonstrate our proposed model achieves better performance than the state-of-the-art ones.
LGOct 11, 2020
Mining Truck Platooning Patterns Through Massive Trajectory DataXiaolei Ma, Enze Huo, Haiyang Yu et al.
Truck platooning refers to a series of trucks driving in close proximity via communication technologies, and it is considered one of the most implementable systems of connected and automated vehicles, bringing huge energy savings and safety improvements. Properly planning platoons and evaluating the potential of truck platooning are crucial to trucking companies and transportation authorities. This study proposes a series of data mining approaches to learn spontaneous truck platooning patterns from massive trajectories. An enhanced map matching algorithm is developed to identify truck headings by using digital map data, followed by an adaptive spatial clustering algorithm to detect instantaneous co-moving truck sets. These sets are then aggregated to find the network-wide maximum platoon duration and size through frequent itemset mining for computational efficiency. We leverage real GPS data collected from truck fleeting systems in Liaoning Province, China, to evaluate platooning performance and successfully extract spatiotemporal platooning patterns. Results show that approximately 36% spontaneous truck platoons can be coordinated by speed adjustment without changing routes and schedules. The average platooning distance and duration ratios for these platooned trucks are 9.6% and 9.9%, respectively, leading to a 2.8% reduction in total fuel consumption. We also distinguish the optimal platooning periods and space headways for national freeways and trunk roads, and prioritize the road segments with high possibilities of truck platooning. The derived results are reproducible, providing useful policy implications and operational strategies for large-scale truck platoon planning and roadside infrastructure construction.
LGNov 1, 2018
Forecasting Transportation Network Speed Using Deep Capsule Networks with Nested LSTM ModelsXiaolei Ma, Yi Li, Zhiyong Cui et al.
Accurate and reliable traffic forecasting for complicated transportation networks is of vital importance to modern transportation management. The complicated spatial dependencies of roadway links and the dynamic temporal patterns of traffic states make it particularly challenging. To address these challenges, we propose a new capsule network (CapsNet) to extract the spatial features of traffic networks and utilize a nested LSTM (NLSTM) structure to capture the hierarchical temporal dependencies in traffic sequence data. A framework for network-level traffic forecasting is also proposed by sequentially connecting CapsNet and NLSTM. On the basis of literature review, our study is the first to adopt CapsNet and NLSTM in the field of traffic forecasting. An experiment on a Beijing transportation network with 278 links shows that the proposed framework with the capability of capturing complicated spatiotemporal traffic patterns outperforms multiple state-of-the-art traffic forecasting baseline models. The superiority and feasibility of CapsNet and NLSTM are also demonstrated, respectively, by visualizing and quantitatively evaluating the experimental results.
LGMay 7, 2017
Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation NetworksHaiyang Yu, Zhihai Wu, Shuqin Wang et al.
Predicting large-scale transportation network traffic has become an important and challenging topic in recent decades. Inspired by the domain knowledge of motion prediction, in which the future motion of an object can be predicted based on previous scenes, we propose a network grid representation method that can retain the fine-scale structure of a transportation network. Network-wide traffic speeds are converted into a series of static images and input into a novel deep architecture, namely, spatiotemporal recurrent convolutional networks (SRCNs), for traffic forecasting. The proposed SRCNs inherit the advantages of deep convolutional neural networks (DCNNs) and long short-term memory (LSTM) neural networks. The spatial dependencies of network-wide traffic can be captured by DCNNs, and the temporal dynamics can be learned by LSTMs. An experiment on a Beijing transportation network with 278 links demonstrates that SRCNs outperform other deep learning-based algorithms in both short-term and long-term traffic prediction.
LGJan 16, 2017
Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed PredictionXiaolei Ma, Zhuang Dai, Zhengbing He et al.
This paper proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with a high accuracy. Spatiotemporal traffic dynamics are converted to images describing the time and space relations of traffic flow via a two-dimensional time-space matrix. A CNN is applied to the image following two consecutive steps: abstract traffic feature extraction and network-wide traffic speed prediction. The effectiveness of the proposed method is evaluated by taking two real-world transportation networks, the second ring road and north-east transportation network in Beijing, as examples, and comparing the method with four prevailing algorithms, namely, ordinary least squares, k-nearest neighbors, artificial neural network, and random forest, and three deep learning architectures, namely, stacked autoencoder, recurrent neural network, and long-short-term memory network. The results show that the proposed method outperforms other algorithms by an average accuracy improvement of 42.91% within an acceptable execution time. The CNN can train the model in a reasonable time and, thus, is suitable for large-scale transportation networks.