16.6 CR Apr 5
Streaming Chain
Yi Lyu
Blockchain and blockchain-inspired decentralized applications are on the rise thanks to characteristics such as decentralization, anonymity, and tamper resistance. However, blockchain transactions tend to experience long end-to-end latency. A major contributor is the block creation step, which can stall transaction processing. There are two approaches to reducing this overhead: speeding up block creation, or processing transactions before block creation finishes. In this project, we work towards a self-adaptive block creation process that automatically selects optimal configurations based on workload and hardware resources. To that end, we (1) define mathematical models that predict transaction latency from design and environmental parameters; (2) develop measurement techniques that collect performance-related metrics in Docker-hosted blockchain systems, and observe trends to build intuition; and (3) define a mathematical model that predicts transaction success rate under various key access patterns and block size configurations, validating it with simulation-based measurements.
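To make the idea of validating a success-rate model against simulation concrete, here is a minimal sketch. It does not reproduce the project's actual model. It assumes a deliberately simple conflict rule (each transaction touches one key drawn uniformly at random, and only the first transaction touching a given key within a block commits), and the function names and parameters are hypothetical.

```python
import random


def analytic_success_rate(block_size: int, num_keys: int) -> float:
    """Expected commit fraction under the ASSUMED rule: per block, only the
    first transaction touching each key commits, with keys drawn uniformly."""
    # Expected number of distinct keys among `block_size` uniform draws.
    expected_distinct = num_keys * (1.0 - (1.0 - 1.0 / num_keys) ** block_size)
    return expected_distinct / block_size


def simulated_success_rate(block_size: int, num_keys: int,
                           num_blocks: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity (hypothetical workload)."""
    rng = random.Random(seed)
    committed = 0
    for _ in range(num_blocks):
        seen = set()  # keys already written in this block
        for _ in range(block_size):
            key = rng.randrange(num_keys)
            if key not in seen:
                seen.add(key)
                committed += 1  # first access to this key in the block commits
    return committed / (num_blocks * block_size)


if __name__ == "__main__":
    for size in (10, 50, 200):
        print(size,
              round(analytic_success_rate(size, 100), 3),
              round(simulated_success_rate(size, 100), 3))
```

Under this assumed rule, larger blocks over the same key space see a lower success rate, which is the kind of trade-off a block size configuration would have to weigh against block creation overhead.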
9.6 DC Apr 2
ModTrans: Translating Real-world Models for Distributed Training Simulator
Yi Lyu
Large-scale distributed training has been an active research topic in machine learning systems, in both industry and academia, in recent years. However, conducting experiments without physical machines and the corresponding resources is difficult. One solution is to use distributed training simulators, but current ones such as ASTRA-sim do not support importing models developed in the real world, which makes them hard for ML researchers to use. To address this challenge, we developed ModTrans, a translator that converts any real-world model into the input format of the ASTRA-sim simulator, removing the barrier between machine learning experts and machine learning system researchers. Experimental results show that ModTrans's cost is negligible.
LG Apr 15, 2024
Efflex: Efficient and Flexible Pipeline for Spatio-Temporal Trajectory Graph Modeling and Representation Learning
Ming Cheng, Ziyi Zhou, Bowen Zhang et al.
In the landscape of spatio-temporal data analytics, effective trajectory representation learning is paramount. To bridge the gap of learning accurate representations with efficient and flexible mechanisms, we introduce Efflex, a comprehensive pipeline for graph modeling and representation learning of large-volume spatio-temporal trajectories. Efflex incorporates a multi-scale k-nearest neighbors (KNN) algorithm with feature fusion for graph construction, which reduces dimensionality while preserving essential data features. Moreover, the graph construction mechanism and the high-performance lightweight GCN increase embedding extraction speed by up to 36 times. We further offer Efflex in two versions: Efflex-L for scenarios demanding high accuracy, and Efflex-B for environments requiring swift data processing. Comprehensive experimentation with the Porto and Geolife datasets validates our approach, positioning Efflex as the state-of-the-art in the domain. Such enhancements in speed and accuracy highlight the versatility of Efflex, underscoring its potential for deployment in time-sensitive and computationally constrained applications.
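The abstract does not spell out how the multi-scale KNN graph is built, so the following is only an illustrative sketch of one plausible reading: build a separate k-nearest-neighbor edge set for each of several values of k. The point features, the `scales` values, and both function names are assumptions, and the feature fusion step is not modeled.

```python
import math


def knn_edges(points, k):
    """Return directed edges (i, j) linking each point i to its k nearest neighbors j."""
    edges = set()
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance to p.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        edges.update((i, j) for j in others[:k])
    return edges


def multi_scale_knn_graphs(points, scales=(1, 2)):
    """Build one k-NN edge set per scale k (hypothetical notion of 'multi-scale')."""
    return {k: knn_edges(points, k) for k in scales}


if __name__ == "__main__":
    # Toy per-trajectory feature vectors (made up for illustration).
    toy_points = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0)]
    for k, edges in multi_scale_knn_graphs(toy_points).items():
        print(k, sorted(edges))
```

A brute-force neighbor search like this costs quadratic time in the number of points; it is used here only to keep the sketch dependency-free.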
LG Apr 11, 2024
VeTraSS: Vehicle Trajectory Similarity Search Through Graph Modeling and Representation Learning
Ming Cheng, Bowen Zhang, Ziyu Wang et al.
Trajectory similarity search plays an essential role in autonomous driving, as it enables vehicles to analyze the information and characteristics of different trajectories to make informed decisions and navigate safely in dynamic environments. Existing work on the trajectory similarity search task primarily utilizes sequence-processing algorithms or Recurrent Neural Networks (RNNs), which suffer from the inevitable issues of complicated architecture and heavy training costs. Considering the intricate connections between trajectories, using Graph Neural Networks (GNNs) for data modeling is feasible. However, most methods directly use existing mathematical graph structures as the input instead of constructing specific graphs from certain vehicle trajectory data. This ignores such data's unique and dynamic characteristics. To bridge such a research gap, we propose VeTraSS -- an end-to-end pipeline for Vehicle Trajectory Similarity Search. Specifically, VeTraSS models the original trajectory data into multi-scale graphs, and generates comprehensive embeddings through a novel multi-layer attention-based GNN. The learned embeddings can be used for searching similar vehicle trajectories. Extensive experiments on the Porto and Geolife datasets demonstrate the effectiveness of VeTraSS, where our model outperforms existing work and reaches the state-of-the-art. This demonstrates the potential of VeTraSS for trajectory analysis and safe navigation in self-driving vehicles in the real world.
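As a rough illustration of how learned embeddings can drive a similarity search, the sketch below ranks stored trajectory embeddings by cosine similarity to a query embedding. The abstract does not state which similarity measure VeTraSS uses, so cosine similarity is an assumption here, and the embedding values and trajectory names are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query, embeddings, top_k=2):
    """Return the names of the top_k stored embeddings most similar to the query."""
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Hypothetical embeddings; in practice these would come from the trained GNN.
    toy_embeddings = {
        "traj_a": [1.0, 0.0, 0.0],
        "traj_b": [0.9, 0.1, 0.0],
        "traj_c": [0.0, 1.0, 0.0],
    }
    print(most_similar([1.0, 0.0, 0.0], toy_embeddings))
```

For large trajectory collections, the linear scan would typically be replaced by an approximate nearest-neighbor index, but the ranking idea stays the same.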