Yongqi Dong

LG
h-index: 44
15 papers
106 citations
Novelty: 38%
AI Score: 47

15 Papers

RO | Jun 20, 2023
Safe, Efficient, Comfort, and Energy-saving Automated Driving through Roundabout Based on Deep Reinforcement Learning

Henan Yuan, Penghui Li, Bart van Arem et al.

Traffic scenarios in roundabouts pose substantial complexity for automated driving. Manually mapping all possible scenarios into a state space is labor-intensive and challenging. Deep reinforcement learning (DRL) with its ability to learn from interacting with the environment emerges as a promising solution for training such automated driving models. This study explores, employs, and implements various DRL algorithms, namely Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and Trust Region Policy Optimization (TRPO) to instruct automated vehicles' driving through roundabouts. The driving state space, action space, and reward function are designed. The reward function considers safety, efficiency, comfort, and energy consumption to align with real-world requirements. All three tested DRL algorithms succeed in enabling automated vehicles to drive through the roundabout. To holistically evaluate the performance of these algorithms, this study establishes an evaluation methodology considering multiple indicators such as safety, efficiency, and comfort level. A method employing the Analytic Hierarchy Process is also developed to weigh these evaluation indicators. Experimental results on various testing scenarios reveal that the TRPO algorithm outperforms DDPG and PPO in terms of safety and efficiency, and PPO performs best in terms of comfort level. Lastly, to verify the model's adaptability and robustness regarding other driving scenarios, this study also deploys the model trained by TRPO to a range of different testing scenarios, e.g., highway driving and merging. Experimental results demonstrate that the TRPO model trained on only roundabout driving scenarios exhibits a certain degree of proficiency in highway driving and merging scenarios. This study provides a foundation for the application of automated driving with DRL in real traffic environments.
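The abstract mentions weighing evaluation indicators with the Analytic Hierarchy Process (AHP) but gives no matrix or weights. The following is a minimal sketch of how AHP weights can be derived, assuming a hypothetical 3x3 pairwise comparison matrix over safety, efficiency, and comfort. The judgment values, function names, and the use of the row geometric-mean approximation are illustrative assumptions, not the authors' procedure.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Saaty's consistency ratio; values below about 0.1 are usually accepted."""
    n = len(pairwise)
    # Estimate lambda_max as the average of (A w)_i / w_i over all rows.
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}[n]  # Saaty's random index table
    return ci / random_index

# Hypothetical judgments (NOT from the paper): entry [i][j] says how much more
# important indicator i is than indicator j. Order: safety, efficiency, comfort.
PAIRWISE = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]

weights = ahp_weights(PAIRWISE)
cr = consistency_ratio(PAIRWISE, weights)
print([round(w, 3) for w in weights], round(cr, 4))
```

The resulting weights could then multiply normalized per-indicator scores to give one overall score per algorithm. A consistency ratio well above 0.1 would suggest revisiting the pairwise judgments.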

LG | Jun 5, 2023 | Code
Comparative Study on Semi-supervised Learning Applied for Anomaly Detection in Hydraulic Condition Monitoring System

Yongqi Dong, Kejia Chen, Zhiyuan Ma

Condition-based maintenance is becoming increasingly important in hydraulic systems. However, anomaly detection for these systems remains challenging, especially since anomalous data is scarce and labeling such data is tedious and even dangerous. Therefore, it is advisable to use unsupervised or semi-supervised methods, especially semi-supervised learning, which utilizes unsupervised learning as a feature extraction mechanism to aid the supervised part when only a small number of labels are available. This study systematically compares semi-supervised learning methods applied for anomaly detection in hydraulic condition monitoring systems. Firstly, thorough data analysis and feature learning were carried out to understand the open-sourced hydraulic condition monitoring dataset. Then, various methods were implemented and evaluated, including traditional stand-alone semi-supervised learning models (e.g., one-class SVM, Robust Covariance), ensemble models (e.g., Isolation Forest), and deep neural network based models (e.g., autoencoder, Hierarchical Extreme Learning Machine (HELM)). In particular, this study customized and implemented an extreme learning machine based semi-supervised HELM model and verified its superiority over other semi-supervised methods. Extensive experiments show that the customized HELM model obtained state-of-the-art performance with the highest accuracy (99.5%), the lowest false positive rate (0.015), and the best F1-score (0.985), beating other semi-supervised methods.
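The three reported figures (accuracy, false positive rate, F1-score) all derive from a confusion matrix. As a minimal sketch of how such metrics are computed, assuming binary labels where 1 marks an anomaly, the helper below uses made-up toy labels. The function name and data are illustrative and are not taken from the paper's code.

```python
def detection_metrics(y_true, y_pred):
    """Compute accuracy, false positive rate, and F1 for binary labels (1 = anomaly)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # FPR: share of normal samples wrongly flagged as anomalous.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "fpr": fpr, "f1": f1}

# Toy labels for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Because anomalies are rare in condition monitoring data, the false positive rate and F1-score are typically more informative than accuracy alone.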

RO | Jun 20, 2023 | Code
Comprehensive Training and Evaluation on Deep Reinforcement Learning for Automated Driving in Various Simulated Driving Maneuvers

Yongqi Dong, Tobias Datema, Vincent Wassenaar et al.

Developing and testing automated driving models in the real world can be challenging and even dangerous; simulation can help, especially for challenging maneuvers. Deep reinforcement learning (DRL) has the potential to tackle complex decision-making and control tasks by learning from interaction with the environment, which makes it suitable for developing automated driving, although it has not yet been explored in detail. This study implemented, evaluated, and compared two DRL algorithms, Deep Q-networks (DQN) and Trust Region Policy Optimization (TRPO), for training automated driving on the highway-env simulation platform. Effective and customized reward functions were developed, and the implemented algorithms were evaluated in terms of on-lane accuracy (how well the car drives on the road within the lane), efficiency (how fast the car drives), safety (how likely the car is to crash into obstacles), and comfort (how much the car jerks, e.g., through sudden acceleration or braking). Results show that the TRPO-based models with modified reward functions delivered the best performance in most cases. Furthermore, to train a uniform driving model that can tackle various driving maneuvers besides the specific ones, this study expanded highway-env and developed an extra customized training environment, namely ComplexRoads, integrating various driving maneuvers and multiple road scenarios. Models trained on the designed ComplexRoads environment can adapt well to other driving maneuvers with promising overall performance. Lastly, several functionalities were added to highway-env to implement this work. The code is openly available on GitHub at https://github.com/alaineman/drlcarsim-paper.
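The four evaluation dimensions named above (on-lane accuracy, efficiency, safety, comfort) can be folded into a single scalar reward. The sketch below is one hedged illustration of that idea. The function names, the weights, the speed cap `v_max`, and the use of finite-difference jerk are all assumptions made for this example; they are not the reward functions from the paper or the highway-env API.

```python
def mean_abs_jerk(accelerations, dt):
    """Comfort proxy: mean absolute jerk, i.e. the finite difference of acceleration."""
    jerks = [(a1 - a0) / dt for a0, a1 in zip(accelerations, accelerations[1:])]
    return sum(abs(j) for j in jerks) / len(jerks)

def step_reward(on_lane, speed, crashed, jerk,
                v_max=30.0, weights=(1.0, 1.0, 5.0, 0.1)):
    """Hypothetical per-step reward combining the four evaluation dimensions."""
    w_lane, w_speed, w_crash, w_jerk = weights
    lane_term = w_lane * (1.0 if on_lane else 0.0)    # on-lane accuracy
    speed_term = w_speed * min(speed, v_max) / v_max  # efficiency, capped at v_max
    crash_term = w_crash * (1.0 if crashed else 0.0)  # safety penalty
    jerk_term = w_jerk * abs(jerk)                    # comfort penalty
    return lane_term + speed_term - crash_term - jerk_term

# Toy acceleration trace (m/s^2) sampled every 0.5 s.
comfort = mean_abs_jerk([0.0, 1.0, 1.0, -1.0], dt=0.5)
safe_reward = step_reward(on_lane=True, speed=15.0, crashed=False, jerk=0.0)
crash_reward = step_reward(on_lane=True, speed=15.0, crashed=True, jerk=0.0)
print(comfort, safe_reward, crash_reward)
```

Giving the crash penalty a much larger weight than the other terms is a common design choice so that the agent cannot trade safety for speed.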

LG | Jul 21, 2022
Comparative Study on Supervised versus Semi-supervised Machine Learning for Anomaly Detection of In-vehicle CAN Network

Yongqi Dong, Kejia Chen, Yinxuan Peng et al.

As the central nerve of the intelligent vehicle control system, the in-vehicle network bus is crucial to the security of vehicle driving. One of the best standards for the in-vehicle network is the Controller Area Network (CAN bus) protocol. However, the CAN bus is vulnerable to various attacks because it lacks security mechanisms. To enhance the security of in-vehicle networks and promote research in this area, this study comprehensively compared fully-supervised with semi-supervised machine learning methods for CAN message anomaly detection, based on large-scale CAN network traffic data and the valuable features extracted from it. Both traditional machine learning models (including single classifier and ensemble models) and neural network based deep learning models are evaluated. Furthermore, this study proposed a deep autoencoder based semi-supervised learning method for CAN message anomaly detection and verified its superiority over other semi-supervised methods. Extensive experiments show that the fully-supervised methods generally outperform semi-supervised ones, as they use more information as inputs. In particular, the developed XGBoost-based model obtained state-of-the-art performance with the best accuracy (98.65%), precision (0.9853), and ROC AUC (0.9585), beating other methods reported in the literature.

CV | Feb 3 | Code
Efficient Sequential Neural Network with Spatial-Temporal Attention and Linear LSTM for Robust Lane Detection Using Multi-Frame Images

Sandeep Patil, Yongqi Dong, Haneen Farah et al.

Lane detection is a crucial perception task for all levels of automated vehicles (AVs) and Advanced Driver Assistance Systems, particularly in mixed-traffic environments where AVs must interact with human-driven vehicles (HDVs) and handle challenging traffic scenarios. Current methods lack versatility in delivering accurate, robust, and real-time compatible lane detection. Vision-based methods in particular often neglect critical regions of the image and their spatial-temporal (ST) salience, leading to poor performance in difficult circumstances such as serious occlusion and dazzling light. This study introduces a novel sequential neural network model with a spatial-temporal attention mechanism to focus on key features of lane lines and exploit salient ST correlations among continuous image frames. The proposed model, built on a standard encoder-decoder structure and common neural network backbones, is trained and evaluated on three large-scale open-source datasets. Extensive experiments demonstrate the strength and robustness of the proposed model, which outperforms state-of-the-art methods in various testing scenarios. Furthermore, with the ST attention mechanism, the developed sequential neural network models have fewer parameters and fewer Multiply-Accumulate Operations (MACs) than baseline sequential models, highlighting their computational efficiency. Relevant data, code, and models are released at https://doi.org/10.4121/4619cab6-ae4a-40d5-af77-582a77f3d821.

51.2 | RO | Mar 10
Latent World Models for Automated Driving: A Unified Taxonomy, Evaluation Framework, and Open Challenges

Rongxiang Zeng, Yongqi Dong

Emerging generative world models and vision-language-action (VLA) systems are rapidly reshaping automated driving by enabling scalable simulation, long-horizon forecasting, and capability-rich decision making. Across these directions, latent representations serve as the central computational substrate: they compress high-dimensional multi-sensor observations, enable temporally coherent rollouts, and provide interfaces for planning, reasoning, and controllable generation. This paper proposes a unifying latent-space framework that synthesizes recent progress in world models for automated driving. The framework organizes the design space by the target and form of latent representations (latent worlds, latent actions, latent generators; continuous states, discrete tokens, and hybrids) and by structural priors for geometry, topology, and semantics. Building on this taxonomy, the paper articulates five cross-cutting internal mechanics (i.e., structural isomorphism, long-horizon temporal stability, semantic and reasoning alignment, value-aligned objectives and post-training, as well as adaptive computation and deliberation) and connects these design choices to robustness, generalization, and deployability. The work also proposes concrete evaluation prescriptions, including a closed-loop metric suite and a resource-aware deliberation cost, designed to reduce the open-loop / closed-loop mismatch. Finally, the paper identifies actionable research directions toward advancing latent world models for decision-ready, verifiable, and resource-efficient automated driving.

LG | May 2, 2024
Leverage Multi-source Traffic Demand Data Fusion with Transformer Model for Urban Parking Prediction

Yin Huang, Yongqi Dong, Youhua Tang et al.

The escalation in urban private car ownership has worsened the urban parking predicament, necessitating effective parking availability prediction for urban planning and management. However, existing prediction methods suffer from low accuracy because they lack spatial-temporal correlation features related to parking volume and neglect flow patterns and correlations between similar parking lots within certain areas. To address these challenges, this study proposes a parking availability prediction framework integrating spatial-temporal deep learning with multi-source data fusion, encompassing traffic demand data from multiple sources (e.g., metro, bus, taxi services) and parking lot data. The framework uses the Transformer as the spatial-temporal deep learning model and leverages K-means clustering to establish parking cluster zones, extracting and integrating traffic demand characteristics from various transportation modes (i.e., metro, bus, online ride-hailing, and taxi) connected to parking lots. Real-world empirical data was used to verify the effectiveness of the proposed method against different machine learning, deep learning, and traditional statistical models for predicting parking availability. Experimental results reveal that, with the proposed pipeline, the developed Transformer model outperforms other models in terms of various metrics, e.g., Mean Squared Error (MSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). By fusing multi-source demand data with spatial-temporal deep learning techniques, this approach offers the potential to develop parking availability prediction systems that furnish more accurate and timely information to both drivers and urban planners, thereby fostering more efficient and sustainable urban mobility.
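For readers unfamiliar with the three error metrics named above, here is a minimal sketch of their standard definitions. The occupancy numbers are invented for illustration and the function names are not from the paper.

```python
def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent; undefined when a true value is 0."""
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented example: observed vs. predicted free parking spaces in three intervals.
observed = [100, 200, 400]
predicted = [110, 190, 380]
print(mse(observed, predicted), mae(observed, predicted), mape(observed, predicted))
```

MSE penalizes large errors more heavily than MAE, while MAPE is scale-free but breaks down when a lot is completely full or empty (a true value of zero), which is one reason several metrics are usually reported together.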

CV | Dec 7, 2023
Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning

Yongqi Dong, Xingmin Lu, Ruohan Li et al.

The burgeoning navigation services using digital maps provide great convenience to drivers. Nevertheless, the presence of anomalies in lane rendering map images occasionally introduces potential hazards, as such anomalies can be misleading to human drivers and consequently contribute to unsafe driving conditions. In response to this concern and to accurately and effectively detect the anomalies, this paper transforms lane rendering image anomaly detection into a classification problem and proposes a four-phase pipeline consisting of data pre-processing, self-supervised pre-training with the masked image modeling (MiM) method, customized fine-tuning using cross-entropy based loss with label smoothing, and post-processing to tackle it leveraging state-of-the-art deep learning techniques, especially those involving Transformer models. Various experiments verify the effectiveness of the proposed pipeline. Results indicate that the proposed pipeline exhibits superior performance in lane rendering image anomaly detection, and notably, the self-supervised pre-training with MiM can greatly enhance the detection accuracy while significantly reducing the total training time. For instance, employing the Swin Transformer with Uniform Masking as self-supervised pretraining (Swin-Trans-UM) yielded a heightened accuracy at 94.77% and an improved Area Under The Curve (AUC) score of 0.9743 compared with the pure Swin Transformer without pre-training (Swin-Trans) with an accuracy of 94.01% and an AUC of 0.9498. The fine-tuning epochs were dramatically reduced to 41 from the original 280. In conclusion, the proposed pipeline, with its incorporation of self-supervised pre-training using MiM and other advanced deep learning techniques, emerges as a robust solution for enhancing the accuracy and efficiency of lane rendering image anomaly detection in digital navigation systems.
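The fine-tuning phase above uses a cross-entropy based loss with label smoothing. As a minimal sketch of that general technique, the function below mixes the one-hot target with a uniform distribution before computing cross-entropy. The smoothing factor of 0.1 and the two-class toy logits are assumptions for illustration; the abstract does not state the values the authors used.

```python
import math

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy against a label-smoothed target distribution.

    The smoothed target is q_i = (1 - smoothing) * onehot_i + smoothing / K.
    """
    k = len(logits)
    probs = softmax(logits)
    loss = 0.0
    for i, p in enumerate(probs):
        q = (1.0 - smoothing) * (1.0 if i == target else 0.0) + smoothing / k
        loss -= q * math.log(p)
    return loss

# Toy two-class example (normal vs. anomalous rendering), true class = 0.
hard_loss = smoothed_cross_entropy([2.0, 0.0], target=0, smoothing=0.0)
smooth_loss = smoothed_cross_entropy([2.0, 0.0], target=0, smoothing=0.1)
print(round(hard_loss, 4), round(smooth_loss, 4))
```

With smoothing enabled, a confident correct prediction still incurs some loss, which discourages over-confident outputs.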

DL | Jan 20
Measuring the State of Open Science in Transportation Using Large Language Models

Junyi Ji, Ruth Lu, Linda Belkessa et al.

Open science initiatives have strengthened scientific integrity and accelerated research progress across many fields, but the state of their practice within transportation research remains under-investigated. Key features of open science, defined here as data and code availability, are difficult to extract due to the inherent complexity of the field. Previous work has either been limited to small-scale studies due to the labor-intensive nature of manual analysis or has relied on large-scale bibliometric approaches that sacrifice contextual richness. This paper introduces an automatic and scalable feature-extraction pipeline to measure data and code availability in transportation research. We employ Large Language Models (LLMs) for this task and validate their performance against a manually curated dataset and through an inter-rater agreement analysis. We applied this pipeline to examine 10,724 research articles published in the Transportation Research Part series of journals between 2019 and 2024. Our analysis found that only 5% of quantitative papers shared a code repository, 4% of quantitative papers shared a data repository, and about 3% of papers shared both, with trends differing across journals, topics, and geographic regions. We found no significant difference in citation counts or review duration between papers that provided data and code and those that did not, suggesting a misalignment between open science efforts and traditional academic metrics. Consequently, encouraging these practices will likely require structural interventions from journals and funding agencies to supplement the lack of direct author incentives. The pipeline developed in this study can be readily scaled to other journals, representing a critical step toward the automated measurement and monitoring of open science practices in transportation research.

LG | Sep 4, 2025
Parking Availability Prediction via Fusing Multi-Source Data with A Self-Supervised Learning Enhanced Spatio-Temporal Inverted Transformer

Yin Huang, Yongqi Dong, Youhua Tang et al.

The rapid growth of private car ownership has worsened the urban parking predicament, underscoring the need for accurate and effective parking availability prediction to support urban planning and management. To address key limitations in modeling spatio-temporal dependencies and exploiting multi-source data for parking availability prediction, this study proposes a novel approach with SST-iTransformer. The methodology leverages K-means clustering to establish parking cluster zones (PCZs), extracting and integrating traffic demand characteristics from various transportation modes (i.e., metro, bus, online ride-hailing, and taxi) associated with the targeted parking lots. Building on the vanilla iTransformer, SST-iTransformer integrates masking-reconstruction-based pretext tasks for self-supervised spatio-temporal representation learning, and features an innovative dual-branch attention mechanism: Series Attention captures long-term temporal dependencies via patching operations, while Channel Attention models cross-variate interactions through inverted dimensions. Extensive experiments using real-world data from Chengdu, China, demonstrate that SST-iTransformer outperforms baseline deep learning models (including Informer, Autoformer, Crossformer, and iTransformer), achieving state-of-the-art performance with the lowest mean squared error (MSE) and competitive mean absolute error (MAE). Comprehensive ablation studies quantitatively reveal the relative importance of different data sources: incorporating ride-hailing data provides the largest performance gains, followed by taxi, whereas fixed-route transit features (bus/metro) contribute marginally. Spatial correlation analysis further confirms that excluding historical data from correlated parking lots within PCZs leads to substantial performance degradation, underscoring the importance of modeling spatial dependencies.
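A masking-reconstruction pretext task hides part of the input and trains the model to recover it, with the loss computed only on the hidden positions. The sketch below shows just the data side of that idea on a one-dimensional series. The mask ratio, the zero fill value, and the function names are assumptions for illustration and do not reflect the SST-iTransformer implementation.

```python
import random

def mask_series(values, mask_ratio, rng, mask_value=0.0):
    """Randomly hide a fraction of time steps; return the corrupted copy and masked indices."""
    n_mask = max(1, round(mask_ratio * len(values)))
    masked_idx = sorted(rng.sample(range(len(values)), n_mask))
    corrupted = list(values)
    for i in masked_idx:
        corrupted[i] = mask_value
    return corrupted, masked_idx

def masked_mse(original, reconstructed, masked_idx):
    """Reconstruction loss evaluated only at the masked positions."""
    return sum((original[i] - reconstructed[i]) ** 2 for i in masked_idx) / len(masked_idx)

rng = random.Random(0)  # fixed seed so the example is reproducible
series = [float(v) for v in range(1, 11)]  # toy occupancy series
corrupted, masked_idx = mask_series(series, mask_ratio=0.3, rng=rng)

# A model would map `corrupted` to a reconstruction; two extremes are compared here.
loss_no_recovery = masked_mse(series, corrupted, masked_idx)  # model outputs the mask value
loss_perfect = masked_mse(series, series, masked_idx)         # model recovers everything
print(masked_idx, loss_no_recovery, loss_perfect)
```

Restricting the loss to masked positions forces the encoder to infer hidden values from context instead of simply copying visible inputs.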

LG | May 2, 2025
A Self-Supervised Transformer for Unusable Shared Bike Detection

Yin Huang, Yongqi Dong, Youhua Tang et al.

The rapid expansion of bike-sharing systems (BSS) has greatly improved urban "last-mile" connectivity, yet large-scale deployments face escalating operational challenges, particularly in detecting faulty bikes. Existing detection approaches either rely on static model-based thresholds that overlook dynamic spatiotemporal (ST) usage patterns or employ supervised learning methods that struggle with label scarcity and class imbalance. To address these limitations, this paper proposes a novel Self-Supervised Transformer (SSTransformer) framework for automatically detecting unusable shared bikes, leveraging ST features extracted from GPS trajectories and trip records. The model incorporates a self-supervised pre-training strategy to enhance its feature extraction capabilities, followed by fine-tuning for efficient status recognition. In the pre-training phase, the Transformer encoder learns generalized representations of bike movement via a self-supervised objective; in the fine-tuning phase, the encoder is adapted to a downstream binary classification task. Comprehensive experiments on a real-world dataset of 10,730 bikes (1,870 unusable, 8,860 normal) from Chengdu, China, demonstrate that SSTransformer significantly outperforms traditional machine learning, ensemble learning, and deep learning baselines, achieving the best accuracy (97.81%), precision (0.8889), and F1-score (0.9358). This work highlights the effectiveness of self-supervised Transformer on ST data for capturing complex anomalies in BSS, paving the way toward more reliable and scalable maintenance solutions for shared mobility.
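The class counts and scores quoted above allow a small sanity calculation showing why accuracy alone is a weak measure on this dataset. The sketch below computes the accuracy of a trivial classifier that labels every bike as normal, and back-solves the recall implied by the reported precision and F1-score. The implied recall is derived here from rounded figures and is not a number reported in the abstract.

```python
# Figures quoted from the abstract above.
N_UNUSABLE = 1870
N_NORMAL = 8860
PRECISION = 0.8889
F1 = 0.9358

total = N_UNUSABLE + N_NORMAL

# Accuracy of always predicting the majority class ("normal").
majority_baseline_accuracy = N_NORMAL / total

# F1 = 2PR / (P + R)  =>  R = F1 * P / (2P - F1)
implied_recall = F1 * PRECISION / (2 * PRECISION - F1)

print(total, round(majority_baseline_accuracy, 4), round(implied_recall, 3))
```

The trivial baseline already reaches roughly 82.6% accuracy, so the reported 97.81% is best read alongside precision and F1. The implied recall of about 0.99 is sensitive to rounding in the published figures and should be treated as approximate.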

RO | Jan 10, 2025
Towards Developing Socially Compliant Automated Vehicles: Advances, Expert Insights, and A Conceptual Framework

Yongqi Dong, Bart van Arem, Haneen Farah

Automated Vehicles (AVs) hold promise for revolutionizing transportation by improving road safety, traffic efficiency, and overall mobility. Despite the steady advancement in high-level AVs in recent years, the transition to full automation entails a period of mixed traffic, where AVs of varying automation levels coexist with human-driven vehicles (HDVs). Making AVs socially compliant and understood by human drivers is expected to improve the safety and efficiency of mixed traffic. Thus, ensuring AVs' compatibility with HDVs and social acceptance is crucial for their successful and seamless integration into mixed traffic. However, research in this critical area of developing Socially Compliant AVs (SCAVs) remains sparse. This study carries out the first comprehensive scoping review to assess the current state of the art in developing SCAVs, identifying key concepts, methodological approaches, and research gaps. An informal expert interview was also conducted to discuss the literature review results and identify critical research gaps and expectations towards SCAVs. Based on the scoping review and expert interview input, a conceptual framework is proposed for the development of SCAVs. The conceptual framework is evaluated using an online survey targeting researchers, technicians, policymakers, and other relevant professionals worldwide. The survey results provide valuable validation and insights, affirming the significance of the proposed conceptual framework in tackling the challenges of integrating AVs into mixed-traffic environments. Additionally, future research perspectives and suggestions are discussed, contributing to the research and development agenda of SCAVs.

LG | Dec 7, 2023
Data-Driven Semi-Supervised Machine Learning with Safety Indicators for Abnormal Driving Behavior Detection

Yongqi Dong, Lanxin Zhang, Haneen Farah et al.

Detecting abnormal driving behavior is critical for road traffic safety and the evaluation of drivers' behavior. With the advancement of machine learning (ML) algorithms and the accumulation of naturalistic driving data, many ML models have been adopted for abnormal driving behavior detection (also referred to in this paper as "anomalies"). Most existing ML-based detectors rely on (fully) supervised ML methods, which require substantial labeled data. However, ground truth labels are not always available in the real world, and labeling large amounts of data is tedious. Thus, there is a need to explore unsupervised or semi-supervised methods to make the anomaly detection process more feasible and efficient. To fill this research gap, this study analyzes large-scale real-world data revealing several abnormal driving behaviors (e.g., sudden acceleration, rapid lane-changing) and develops a hierarchical extreme learning machine (HELM)-based semi-supervised ML method using partly labeled data to detect the identified abnormal driving behaviors. Moreover, previous ML-based approaches predominantly utilized basic vehicle motion features (such as velocity and acceleration) to label and detect abnormal driving behaviors, while this study seeks to introduce event-level safety indicators as input features for ML models to improve detection performance. Results from extensive experiments demonstrate the effectiveness of the proposed semi-supervised ML model with the introduced safety indicators serving as important features. The proposed semi-supervised ML method outperforms other baseline semi-supervised or unsupervised methods: for example, it delivers the best accuracy at 99.58% and the best F1-score at 0.9913. The ablation study further highlights the significance of safety indicators for advancing the detection performance of abnormal driving behaviors.

CV | May 26, 2023
Robust Lane Detection through Self Pre-training with Masked Sequential Autoencoders and Fine-tuning with Customized PolyLoss

Ruohan Li, Yongqi Dong

Lane detection is crucial for vehicle localization, which makes it the foundation for automated driving and many intelligent and advanced driving assistance systems. Available vision-based lane detection methods do not make full use of the valuable features and aggregate contextual information, especially the interrelationships between lane lines and other regions of the images in continuous frames. To fill this research gap and upgrade lane detection performance, this paper proposes a pipeline consisting of self pre-training with masked sequential autoencoders and fine-tuning with customized PolyLoss for end-to-end neural network models using multiple continuous image frames. The masked sequential autoencoders are adopted to pre-train the neural network models, with reconstructing the missing pixels of a randomly masked image as the objective. Then, in the fine-tuning phase where lane detection segmentation is performed, the continuous image frames serve as the inputs, and the pre-trained model weights are transferred and further updated via backpropagation, with the customized PolyLoss calculating the weighted errors between the output lane detection results and the labeled ground truth. Extensive experimental results demonstrate that, with the proposed pipeline, lane detection performance on both normal and challenging scenes can be advanced beyond the state-of-the-art, delivering the best testing accuracy (98.38%), precision (0.937), and F1-measure (0.924) on the normal scene testing set, together with the best overall accuracy (98.36%) and precision (0.844) on the challenging scene testing set, while the training time can be substantially shortened.
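The abstract does not spell out the customized PolyLoss. As a hedged illustration of the general family, the sketch below implements the commonly used Poly-1 form for a binary per-pixel decision, which adds a term epsilon * (1 - p_t) to cross-entropy, plus a simple class-weighted average. The epsilon value, the lane weight, and the function names are assumptions and should not be read as the authors' customization.

```python
import math

def poly1_bce(prob_lane, label, epsilon=1.0, eps_clip=1e-7):
    """Poly-1 loss for one pixel: cross-entropy plus epsilon * (1 - p_t)."""
    p = min(max(prob_lane, eps_clip), 1.0 - eps_clip)  # avoid log(0)
    p_t = p if label == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t) + epsilon * (1.0 - p_t)

def weighted_poly1(probs, labels, lane_weight=2.0, epsilon=1.0):
    """Weighted mean over pixels; lane pixels (label 1) count `lane_weight` times."""
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = lane_weight if y == 1 else 1.0
        total += w * poly1_bce(p, y, epsilon)
        weight_sum += w
    return total / weight_sum

# Toy predictions for four pixels (probability of "lane") and their labels.
probs = [0.9, 0.6, 0.2, 0.1]
labels = [1, 1, 0, 0]
print(round(weighted_poly1(probs, labels), 4))
```

Up-weighting lane pixels is one common way to counter the strong imbalance between thin lane markings and background in segmentation masks.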

CV | Oct 5, 2021
A Hybrid Spatial-temporal Deep Learning Architecture for Lane Detection

Yongqi Dong, Sandeep Patil, Bart van Arem et al.

Accurate and reliable lane detection is vital for the safe performance of lane-keeping assistance and lane departure warning systems. However, under certain challenging circumstances, it is difficult to detect lanes accurately from one single image, as is mostly done in the current literature. Since lane markings are continuous lines, lanes that are difficult to detect accurately in the current single image can potentially be better deduced if information from previous frames is incorporated. This study proposes a novel hybrid spatial-temporal (ST) sequence-to-one deep learning architecture. This architecture makes full use of the ST information in multiple continuous image frames to detect the lane markings in the very last frame. Specifically, the hybrid model integrates the following aspects: (a) a single-image feature extraction module equipped with a spatial convolutional neural network; (b) an ST feature integration module constructed with an ST recurrent neural network; (c) an encoder-decoder structure, which lets this image segmentation problem work in an end-to-end supervised learning format. Extensive experiments reveal that the proposed model architecture can effectively handle challenging driving scenes and outperforms available state-of-the-art methods.