MLOct 21, 2022
Correlating sparse sensing for large-scale traffic speed estimation: A Laplacian-enhanced low-rank tensor kriging approachTong Nie, Guoyang Qin, Yunpeng Wang et al.
Traffic speed is central to characterizing the fluidity of the road network. Many transportation applications rely on it, such as real-time navigation, dynamic route planning, and congestion management. Rapid advances in sensing and communication techniques make traffic speed detection easier than ever. However, due to sparse deployment of static sensors or low penetration of mobile sensors, speeds detected are incomplete and far from network-wide use. In addition, sensors are prone to error or missing data due to various kinds of reasons, speeds from these sensors can become highly noisy. These drawbacks call for effective techniques to recover credible estimates from the incomplete data. In this work, we first identify the issue as a spatiotemporal kriging problem and propose a Laplacian enhanced low-rank tensor completion (LETC) framework featuring both lowrankness and multi-dimensional correlations for large-scale traffic speed kriging under limited observations. To be specific, three types of speed correlation including temporal continuity, temporal periodicity, and spatial proximity are carefully chosen and simultaneously modeled by three different forms of graph Laplacian, named temporal graph Fourier transform, generalized temporal consistency regularization, and diffusion graph regularization. We then design an efficient solution algorithm via several effective numeric techniques to scale up the proposed model to network-wide kriging. By performing experiments on two public million-level traffic speed datasets, we finally draw the conclusion and find our proposed LETC achieves the state-of-the-art kriging performance even under low observation rates, while at the same time saving more than half computing time compared with baseline methods. Some insights into spatiotemporal traffic data modeling and kriging at the network level are provided as well.
LGMar 16, 2022Code
Example PerplexityNevin L. Zhang, Weiyan Xie, Zhi Lin et al.
Some examples are easier for humans to classify than others. The same should be true for deep neural networks (DNNs). We use the term example perplexity to refer to the level of difficulty of classifying an example. In this paper, we propose a method to measure the perplexity of an example and investigate what factors contribute to high example perplexity. The related codes and resources are available at https://github.com/vaynexie/Example-Perplexity.
CVJul 22, 2024Code
X-Recon: Learning-based Patient-specific High-Resolution CT Reconstruction from Orthogonal X-Ray ImagesYunpeng Wang, Kang Wang, Yaoyao Zhuo et al.
Rapid and accurate diagnosis of pneumothorax, utilizing chest X-ray and computed tomography (CT), is crucial for assisted diagnosis. Chest X-ray is commonly used for initial localization of pneumothorax, while CT ensures accurate quantification. However, CT scans involve high radiation doses and can be costly. To achieve precise quantitative diagnosis while minimizing radiation exposure, we proposed X-Recon, a CT ultra-sparse reconstruction network based on ortho-lateral chest X-ray images. X-Recon integrates generative adversarial networks (GANs), including a generator with a multi-scale fusion rendering module and a discriminator enhanced by 3D coordinate convolutional layers, designed to facilitate CT reconstruction. To improve precision, a projective spatial transformer is utilized to incorporate multi-angle projection loss. Additionally, we proposed PTX-Seg, a zero-shot pneumothorax segmentation algorithm, combining image processing techniques with deep-learning models for the segmentation of air-accumulated regions and lung structures. Experiments on a large-scale dataset demonstrate its superiority over existing approaches. X-Recon achieved a significantly higher reconstruction resolution with a higher average spatial resolution and a lower average slice thickness. The reconstruction metrics achieved state-of-the-art performance in terms of several metrics including peak signal-to-noise ratio. The zero-shot segmentation algorithm, PTX-Seg, also demonstrated high segmentation precision for the air-accumulated region, the left lung, and the right lung. Moreover, the consistency analysis for the pneumothorax chest occupancy ratio between reconstructed CT and original CT obtained a high correlation coefficient. Code will be available at: https://github.com/wangyunpengbio/X-Recon
MLMar 10, 2023
Towards better traffic volume estimation: Jointly addressing the underdetermination and nonequilibrium problems with correlation-adaptive GNNsTong Nie, Guoyang Qin, Yunpeng Wang et al.
Traffic volume is an indispensable ingredient to provide fine-grained information for traffic management and control. However, due to limited deployment of traffic sensors, obtaining full-scale volume information is far from easy. Existing works on this topic primarily focus on improving the overall estimation accuracy of a particular method and ignore the underlying challenges of volume estimation, thereby having inferior performances on some critical tasks. This paper studies two key problems with regard to traffic volume estimation: (1) underdetermined traffic flows caused by undetected movements, and (2) non-equilibrium traffic flows arise from congestion propagation. Here we demonstrate a graph-based deep learning method that can offer a data-driven, model-free and correlation adaptive approach to tackle the above issues and perform accurate network-wide traffic volume estimation. Particularly, in order to quantify the dynamic and nonlinear relationships between traffic speed and volume for the estimation of underdetermined flows, a speed patternadaptive adjacent matrix based on graph attention is developed and integrated into the graph convolution process, to capture non-local correlations between sensors. To measure the impacts of non-equilibrium flows, a temporal masked and clipped attention combined with a gated temporal convolution layer is customized to capture time-asynchronous correlations between upstream and downstream sensors. We then evaluate our model on a real-world highway traffic volume dataset and compare it with several benchmark models. It is demonstrated that the proposed model achieves high estimation accuracy even under 20% sensor coverage rate and outperforms other baselines significantly, especially on underdetermined and non-equilibrium flow locations. Furthermore, comprehensive quantitative model analysis are also carried out to justify the model designs.
ROApr 22
Toward Cooperative Driving in Mixed Traffic: An Adaptive Potential Game-Based Approach with Field Test VerificationShiyu Fang, Xiaocong Zhao, Xuekai Liu et al.
Connected autonomous vehicles (CAVs), which represent a significant advancement in autonomous driving technology, have the potential to greatly increase traffic safety and efficiency through cooperative decision-making. However, existing methods often overlook the individual needs and heterogeneity of cooperative participants, making it difficult to transfer them to environments where they coexist with human-driven vehicles (HDVs).To address this challenge, this paper proposes an adaptive potential game (APG) cooperative driving framework. First, the system utility function is established on the basis of a general form of individual utility and its monotonic relationship, allowing for the simultaneous optimization of both individual and system objectives. Second, the Shapley value is introduced to compute each vehicle's marginal utility within the system, allowing its varying impact to be quantified. Finally, the HDV preference estimation is dynamically refined by continuously comparing the observed HDV behavior with the APG's estimated actions, leading to improvements in overall system safety and efficiency. Ablation studies demonstrate that adaptively updating Shapley values and HDV preference estimation significantly improve cooperation success rates in mixed traffic. Comparative experiments further highlight the APG's advantages in terms of safety and efficiency over other cooperative methods. Moreover, the applicability of the approach to real-world scenarios was validated through field tests.
FLU-DYNDec 25, 2024Code
Implicit factorized transformer approach to fast prediction of turbulent channel flowsHuiyu Yang, Yunpeng Wang, Jianchun Wang
Transformer neural operators have recently become an effective approach for surrogate modeling of systems governed by partial differential equations (PDEs). In this paper, we introduce a modified implicit factorized transformer (IFactFormer-m) model which replaces the original chained factorized attention with parallel factorized attention. The IFactFormer-m model successfully performs long-term predictions for turbulent channel flow, whereas the original IFactFormer (IFactFormer-o), Fourier neural operator (FNO), and implicit Fourier neural operator (IFNO) exhibit a poor performance. Turbulent channel flows are simulated by direct numerical simulation using fine grids at friction Reynolds numbers $\text{Re}_τ\approx 180,395,590$, and filtered to coarse grids for training neural operator. The neural operator takes the current flow field as input and predicts the flow field at the next time step, and long-term prediction is achieved in the posterior through an autoregressive approach. The results show that IFactFormer-m, compared to other neural operators and the traditional large eddy simulation (LES) methods including dynamic Smagorinsky model (DSM) and the wall-adapted local eddy-viscosity (WALE) model, reduces prediction errors in the short term, and achieves stable and accurate long-term prediction of various statistical properties and flow structures, including the energy spectrum, mean streamwise velocity, root mean square (rms) values of fluctuating velocities, Reynolds shear stress, and spatial structures of instantaneous velocity. Moreover, the trained IFactFormer-m is much faster than traditional LES methods. By analyzing the attention kernels, we elucidate the reasons why IFactFormer-m converges faster and achieves a stable and accurate long-term prediction compared to IFactFormer-o. Code and data are available at: https://github.com/huiyu-2002/IFactFormer-m.
CVOct 28, 2020Code
AbdomenCT-1K: Is Abdominal Organ Segmentation A Solved Problem?Jun Ma, Yao Zhang, Song Gu et al.
With the unprecedented developments in deep learning, automatic segmentation of main abdominal organs seems to be a solved problem as state-of-the-art (SOTA) methods have achieved comparable results with inter-rater variability on many benchmark datasets. However, most of the existing abdominal datasets only contain single-center, single-phase, single-vendor, or single-disease cases, and it is unclear whether the excellent performance can generalize on diverse datasets. This paper presents a large and diverse abdominal CT organ segmentation dataset, termed AbdomenCT-1K, with more than 1000 (1K) CT scans from 12 medical centers, including multi-phase, multi-vendor, and multi-disease cases. Furthermore, we conduct a large-scale study for liver, kidney, spleen, and pancreas segmentation and reveal the unsolved segmentation problems of the SOTA methods, such as the limited generalization ability on distinct medical centers, phases, and unseen diseases. To advance the unsolved problems, we further build four organ segmentation benchmarks for fully supervised, semi-supervised, weakly supervised, and continual learning, which are currently challenging and active research topics. Accordingly, we develop a simple and effective method for each benchmark, which can be used as out-of-the-box methods and strong baselines. We believe the AbdomenCT-1K dataset will promote future in-depth research towards clinical applicable abdominal organ segmentation methods. The datasets, codes, and trained models are publicly available at https://github.com/JunMa11/AbdomenCT-1K.
IVApr 30
RETO: A Rotary-Enhanced Transformer Operator for High-Fidelity Prediction of Automotive AerodynamicsBojun Zhang, Huiyu Yang, Yunpeng Wang et al.
Rapid aerodynamic evaluation is crucial for modern vehicle design, yet existing neural operators struggle to capture intricate spatial correlations. We propose the rotary-enhanced transformer operator (RETO), a novel neural solver featuring a dual-stage spatial awareness mechanism: sinusoidal-cosine encodings for global referencing and rotary positional encodings (RoPE) for relative displacements. RoPE encodes spatial relations via unitary rotations, enforcing translation invariance and enhancing local gradient resolution. RETO is validated on ShapeNet and the high-fidelity DrivAerML benchmark. On ShapeNet, RETO achieves a relative $L_2$ error of 0.063, outperforming RegDGCNN at 0.125 and representing a 16\% improvement over the Transolver baseline, which yields an error of 0.075. These performance gains are further amplified on the DrivAerML dataset, where RETO achieves relative $L_2$ errors of 0.089 for surface pressure and 0.097 for velocity. In comparison, Transolver results in errors of 0.116 and 0.121 for the same metrics, indicating that RETO achieves precision enhancements of 23\% and 19\%, respectively. For comprehensive comparison, the surface pressure and velocity errors for AB-UBT are 0.102 and 0.124, while RegDGCNN yields 0.235 and 0.312, respectively. Information-theoretical analysis shows that the entropy peak of RETO at 0.35 is significantly lower than that of Transolver at 0.75 under $10^4$ resolution, indicating a focused attentional mechanism capable of preserving localized gradients against global diffusion.
CVFeb 4, 2024
AI-Generated Content Enhanced Computer-Aided Diagnosis Model for Thyroid Nodules: A ChatGPT-Style AssistantJincao Yao, Yunpeng Wang, Zhikai Lei et al.
An artificial intelligence-generated content-enhanced computer-aided diagnosis (AIGC-CAD) model, designated as ThyGPT, has been developed. This model, inspired by the architecture of ChatGPT, could assist radiologists in assessing the risk of thyroid nodules through semantic-level human-machine interaction. A dataset comprising 19,165 thyroid nodule ultrasound cases from Zhejiang Cancer Hospital was assembled to facilitate the training and validation of the model. After training, ThyGPT could automatically evaluate thyroid nodule and engage in effective communication with physicians through human-computer interaction. The performance of ThyGPT was rigorously quantified using established metrics such as the receiver operating characteristic (ROC) curve, area under the curve (AUC), sensitivity, and specificity. The empirical findings revealed that radiologists, when supplemented with ThyGPT, markedly surpassed the diagnostic acumen of their peers utilizing traditional methods as well as the performance of the model in isolation. These findings suggest that AIGC-CAD systems, exemplified by ThyGPT, hold the promise to fundamentally transform the diagnostic workflows of radiologists in forthcoming years.
ROMar 31, 2022
TrajGen: Generating Realistic and Diverse Trajectories with Reactive and Feasible Agent Behaviors for Autonomous DrivingQichao Zhang, Yinfeng Gao, Yikang Zhang et al.
Realistic and diverse simulation scenarios with reactive and feasible agent behaviors can be used for validation and verification of self-driving system performance without relying on expensive and time-consuming real-world testing. Existing simulators rely on heuristic-based behavior models for background vehicles, which cannot capture the complex interactive behaviors in real-world scenarios. To bridge the gap between simulation and the real world, we propose TrajGen, a two-stage trajectory generation framework, which can capture more realistic behaviors directly from human demonstration. In particular, TrajGen consists of the multi-modal trajectory prediction stage and the reinforcement learning based trajectory modification stage. In the first stage, we propose a novel auxiliary RouteLoss for the trajectory prediction model to generate multi-modal diverse trajectories in the drivable area. In the second stage, reinforcement learning is used to track the predicted trajectories while avoiding collisions, which can improve the feasibility of generated trajectories. In addition, we develop a data-driven simulator I-Sim that can be used to train reinforcement learning models in parallel based on naturalistic driving data. The vehicle model in I-Sim can guarantee that the generated trajectories by TrajGen satisfy vehicle kinematic constraints. Finally, we give comprehensive metrics to evaluate generated trajectories for simulation scenarios, which shows that TrajGen outperforms either trajectory prediction or inverse reinforcement learning in terms of fidelity, reactivity, feasibility, and diversity.
IVNov 13, 2020
Metastatic Cancer Image Classification Based On Deep Learning MethodGuanwen Qiu, Xiaobing Yu, Baolin Sun et al.
Using histopathological images to automatically classify cancer is a difficult task for accurately detecting cancer, especially to identify metastatic cancer in small image patches obtained from larger digital pathology scans. Computer diagnosis technology has attracted wide attention from researchers. In this paper, we propose a noval method which combines the deep learning algorithm in image classification, the DenseNet169 framework and Rectified Adam optimization algorithm. The connectivity pattern of DenseNet is direct connections from any layer to all consecutive layers, which can effectively improve the information flow between different layers. With the fact that RAdam is not easy to fall into a local optimal solution, and it can converge quickly in model training. The experimental results shows that our model achieves superior performance over the other classical convolutional neural networks approaches, such as Vgg19, Resnet34, Resnet50. In particular, the Auc-Roc score of our DenseNet169 model is 1.77% higher than Vgg19 model, and the Accuracy score is 1.50% higher. Moreover, we also study the relationship between loss value and batches processed during the training stage and validation stage, and obtain some important and interesting findings.
CRApr 12, 2020
SFE-GACN: A Novel Unknown Attack Detection Method Using Intra Categories Generation in Embedding SpaceAo Liu, Yunpeng Wang, Tao Li
In the encrypted network traffic intrusion detection, deep learning based schemes have attracted lots of attention. However, in real-world scenarios, data is often insufficient (few-shot), which leads to various deviations between the models prediction and the ground truth. Consequently, downstream tasks such as unknown attack detection based on few-shot will be limited by insufficient data. In this paper, we propose a novel unknown attack detection method based on Intra Categories Generation in Embedding Space, namely SFE-GACN, which might be the solution of few-shot problem. Concretely, we first proposed Session Feature Embedding (SFE) to summarize the context of sessions (session is the basic granularity of network traffic), bring the insufficient data to the pre-trained embedding space. In this way, we achieve the goal of preliminary information extension in the few-shot case. Second, we further propose the Generative Adversarial Cooperative Network (GACN), which improves the conventional Generative Adversarial Network by supervising the generated sample to avoid falling into similar categories, and thus enables samples to generate intra categories. Our proposed SFE-GACN can accurately generate session samples in the case of few-shot, and ensure the difference between categories during data augmentation. The detection results show that, compared to the state-of-the-art method, the average TPR is 8.38% higher, and the average FPR is 12.77% lower. In addition, we evaluated the graphics generation capabilities of GACN on the graphics dataset, the result shows our proposed GACN can be popularized for generating easy-confused multi-categories graphics.
LGMay 7, 2017
Spatiotemporal Recurrent Convolutional Networks for Traffic Prediction in Transportation NetworksHaiyang Yu, Zhihai Wu, Shuqin Wang et al.
Predicting large-scale transportation network traffic has become an important and challenging topic in recent decades. Inspired by the domain knowledge of motion prediction, in which the future motion of an object can be predicted based on previous scenes, we propose a network grid representation method that can retain the fine-scale structure of a transportation network. Network-wide traffic speeds are converted into a series of static images and input into a novel deep architecture, namely, spatiotemporal recurrent convolutional networks (SRCNs), for traffic forecasting. The proposed SRCNs inherit the advantages of deep convolutional neural networks (DCNNs) and long short-term memory (LSTM) neural networks. The spatial dependencies of network-wide traffic can be captured by DCNNs, and the temporal dynamics can be learned by LSTMs. An experiment on a Beijing transportation network with 278 links demonstrates that SRCNs outperform other deep learning-based algorithms in both short-term and long-term traffic prediction.
ITMar 22, 2017
Hardware Trojan Detection Game: A Prospect-Theoretic ApproachWalid Saad, Anibal Sanjab, Yunpeng Wang et al.
Outsourcing integrated circuit (IC) manufacturing to offshore foundries has grown exponentially in recent years. Given the critical role of ICs in the control and operation of vehicular systems and other modern engineering designs, such offshore outsourcing has led to serious security threats due to the potential of insertion of hardware trojans - malicious designs that, when activated, can lead to highly detrimental consequences. In this paper, a novel game-theoretic framework is proposed to analyze the interactions between a hardware manufacturer, acting as attacker, and an IC testing facility, acting as defender. The problem is formulated as a noncooperative game in which the attacker must decide on the type of trojan that it inserts while taking into account the detection penalty as well as the damage caused by the trojan. Meanwhile, the resource-constrained defender must decide on the best testing strategy that allows optimizing its overall utility which accounts for both damages and the fines. The proposed game is based on the robust behavioral framework of prospect theory (PT) which allows capturing the potential uncertainty, risk, and irrational behavior in the decision making of both the attacker and defender. For both, the standard rational expected utility (EUT) case and the PT case, a novel algorithm based on fictitious play is proposed and shown to converge to a mixed-strategy Nash equilibrium. For an illustrative case study, thorough analytical results are derived for both EUT and PT to study the properties of the reached equilibrium as well as the impact of key system parameters such as the defender-set fine. Simulation results assess the performance of the proposed framework under both EUT and PT and show that the use of PT will provide invaluable insights on the outcomes of the proposed hardware trojan game, in particular, and system security, in general.
LGJan 16, 2017
Learning Traffic as Images: A Deep Convolutional Neural Network for Large-Scale Transportation Network Speed PredictionXiaolei Ma, Zhuang Dai, Zhengbing He et al.
This paper proposes a convolutional neural network (CNN)-based method that learns traffic as images and predicts large-scale, network-wide traffic speed with a high accuracy. Spatiotemporal traffic dynamics are converted to images describing the time and space relations of traffic flow via a two-dimensional time-space matrix. A CNN is applied to the image following two consecutive steps: abstract traffic feature extraction and network-wide traffic speed prediction. The effectiveness of the proposed method is evaluated by taking two real-world transportation networks, the second ring road and north-east transportation network in Beijing, as examples, and comparing the method with four prevailing algorithms, namely, ordinary least squares, k-nearest neighbors, artificial neural network, and random forest, and three deep learning architectures, namely, stacked autoencoder, recurrent neural network, and long-short-term memory network. The results show that the proposed method outperforms other algorithms by an average accuracy improvement of 42.91% within an acceptable execution time. The CNN can train the model in a reasonable time and, thus, is suitable for large-scale transportation networks.