Miao Ye

NI
h-index: 20
14 papers
90 citations
Novelty: 46%
AI Score: 49

14 Papers

LG Dec 26, 2022 Code
A Novel Self-Supervised Learning-Based Anomaly Node Detection Method Based on an Autoencoder in Wireless Sensor Networks

Miao Ye, Qinghao Zhang, Xingsi Xue et al.

Existing wireless sensor network (WSN) anomaly detection methods consider and analyze only temporal features. To address this, this paper designs a self-supervised anomaly node detection method based on an autoencoder. The method integrates temporal WSN data flow feature extraction, spatial position feature extraction and intermodal WSN correlation feature extraction into the design of the autoencoder to make full use of the spatial and temporal information of the WSN for anomaly detection. First, a fully connected network is used to extract the temporal features of nodes by considering a single mode from a local spatial perspective. Second, a graph neural network (GNN) is used to introduce the WSN topology from a global spatial perspective for anomaly detection and extract the spatial and temporal features of the data flows of nodes and their neighbors by considering a single mode. Then, an adaptive fusion method involving weighted summation is used to extract the relevant features between different modes. In addition, a gated recurrent unit (GRU) is introduced to address the long-term dependence problem in the time dimension. Finally, the reconstructed output of the decoder and the hidden layer representation of the autoencoder are fed into a fully connected network to calculate the anomaly probability of the current system. Because the spatial feature extraction operation is performed in advance, the designed method can be applied to large-scale network anomaly detection by adding a clustering operation. Experiments show that the designed method outperforms the baselines, with an F1 score of 90.6%, which is 5.2% higher than those of the existing anomaly detection methods based on unsupervised reconstruction and prediction. Code and model are available at https://github.com/GuetYe/anomaly_detection/GLSL
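
As a rough illustration of the adaptive weighted-summation fusion mentioned in this abstract, the sketch below combines per-mode feature vectors with softmax-normalized weights. The function names, the softmax normalization, and the toy feature values are assumptions made for illustration; the abstract does not specify how the fusion weights are obtained.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    scores = np.asarray(scores, dtype=float)
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()

def fuse_modes(features, logits):
    """Weighted sum of per-mode features.

    features: array of shape (n_modes, d), one feature vector per mode.
    logits:   array of shape (n_modes,), hypothetical learnable scores.
    Returns the fused vector of shape (d,) and the weights used.
    """
    weights = softmax(logits)
    fused = weights @ np.asarray(features, dtype=float)
    return fused, weights

# Toy example: three modes, two feature dimensions, equal scores.
features = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
fused, weights = fuse_modes(features, logits=[0.0, 0.0, 0.0])
print(weights, fused)
```

In a trained model the logits would be parameters updated by backpropagation, so the fusion weights adapt to whichever mode is most informative.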

NI Jul 31, 2022
DRL-M4MR: An Intelligent Multicast Routing Approach Based on DQN Deep Reinforcement Learning in SDN

Chenwei Zhao, Miao Ye, Xingsi Xue et al.

Traditional multicast routing methods have some problems in constructing a multicast tree, such as limited access to network state information, poor adaptability to dynamic and complex changes in the network, and inflexible data forwarding. To address these defects, the optimal multicast routing problem in software-defined networking (SDN) is formulated as a multi-objective optimization problem, and an intelligent multicast routing algorithm, DRL-M4MR, based on the deep Q network (DQN) deep reinforcement learning (DRL) method is designed to construct a multicast tree in SDN. First, the multicast tree state matrix, link bandwidth matrix, link delay matrix, and link packet loss rate matrix are designed as the state space of the DRL agent by combining the global view and control of the SDN. Second, the action space of the agent is all the links in the network, and the action selection strategy is designed to add the links to the current multicast tree under four cases. Third, single-step and final reward function forms are designed to guide the agent to make decisions to construct the optimal multicast tree. The experimental results show that, compared with existing algorithms, the multicast tree constructed by DRL-M4MR can obtain better bandwidth, delay, and packet loss rate performance after training, and it can make more intelligent multicast routing decisions in a dynamic network environment.
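
The state space described in this abstract can be pictured as four stacked node-by-node matrices. The sketch below is one possible layout, not the authors' implementation: the function names, the max-normalization of each link matrix, and the toy link values are all assumptions.

```python
import numpy as np

def normalize(matrix):
    """Scale a non-negative matrix into [0, 1]; an all-zero matrix is left unchanged."""
    matrix = np.asarray(matrix, dtype=float)
    peak = matrix.max()
    return matrix / peak if peak > 0 else matrix

def build_state(n_nodes, tree_links, bandwidth, delay, loss):
    """Stack the tree-state, bandwidth, delay, and loss matrices into a (4, n, n) tensor."""
    tree = np.zeros((n_nodes, n_nodes))
    for u, v in tree_links:
        tree[u, v] = tree[v, u] = 1.0  # mark links already in the multicast tree
    return np.stack([tree, normalize(bandwidth), normalize(delay), normalize(loss)])

# Hypothetical 3-node network with one link (0, 1) already in the tree.
bandwidth = [[0, 10, 5], [10, 0, 20], [5, 20, 0]]
delay = [[0, 2, 8], [2, 0, 4], [8, 4, 0]]
loss = [[0, 0.01, 0.02], [0.01, 0, 0.04], [0.02, 0.04, 0]]
state = build_state(3, tree_links=[(0, 1)], bandwidth=bandwidth, delay=delay, loss=loss)
print(state.shape)
```

A tensor of this shape can be fed to a convolutional or fully connected Q-network, with one output per candidate link.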

17.3 NI Apr 27
A method for detecting spatio-temporal correlation anomalies of WSN nodes based on topological information enhancement and time-frequency feature extraction

Miao Ye, Ziheng Wang, Qiuxiang Jiang et al.

Existing anomaly detection methods for Wireless Sensor Networks (WSNs) generally suffer from insufficient extraction of spatio-temporal correlation features, reliance on either time-domain or frequency-domain information alone, and high computational overhead. To address these limitations, this paper proposes a topology-enhanced spatio-temporal feature fusion anomaly detection method, TE-MSTAD. First, building upon the RWKV model with linear attention mechanisms, a Cross-modal Feature Extraction (CFE) module is introduced to fully extract spatial correlation features among multiple nodes while reducing computational resource consumption. Second, a strategy is designed to construct an adjacency matrix by jointly learning spatial correlation from time-frequency domain features. Different graph neural networks are integrated to enhance spatial correlation feature extraction, thereby fully capturing spatial relationships among multiple nodes. Finally, a dual-branch network TE-MSTAD is designed for time-frequency domain feature fusion, overcoming the limitations of relying solely on the time or frequency domain to improve WSN anomaly detection performance. Testing on both public and real-world datasets demonstrates that the TE-MSTAD model achieves F1 scores of 92.52% and 93.28%, respectively, exhibiting superior detection performance and generalization capabilities compared to existing methods.

DC Dec 27, 2025
Role-Based Fault Tolerance System for LLM RL Post-Training

Zhenqian Chen, Baoquan Zhong, Xiang Li et al.

RL post-training for LLMs has been widely scaled to enhance reasoning and tool-using capabilities. However, RL post-training interleaves training and inference workloads, exposing the system to faults from both sides. Existing fault tolerance frameworks for LLMs target either training or inference, leaving the optimization potential in the asynchronous execution unexplored for RL. Our key insight is role-based fault isolation, so that a failure in one machine does not affect the others. We treat trainer, rollout, and other management roles in RL training as distinct distributed sub-tasks. Instead of restarting the entire RL task as in ByteRobust, we recover only the failed role and reconnect it to living ones, thereby eliminating the full-restart overhead, including rollout replay and initialization delay. We present RobustRL, the first comprehensive robust system that handles GPU machine errors in RL post-training to improve the Effective Training Time Ratio (ETTR). (1) Detect. We implement role-aware monitoring to distinguish actual failures from role-specific behaviors, avoiding false positives and delayed detection. (2) Restart. For trainers, we implement a non-disruptive recovery where rollouts persist state and continue trajectory generation, while the trainer is rapidly restored via rollout warm standbys. For rollouts, we perform isolated machine replacement without interrupting the RL task. (3) Reconnect. We replace static collective communication with dynamic, UCX-based (Unified Communication X) point-to-point communication, enabling immediate weight synchronization between recovered roles. In an RL training task on a 256-GPU cluster with a Qwen3-8B-Math workload under 10% failure injection frequency, RobustRL achieves an ETTR of over 80%, compared with 60% for ByteRobust, and is 8.4%-17.4% faster in end-to-end training time.
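
To make the ETTR metric concrete, the sketch below assumes the common reading of ETTR as productive training time divided by total wall-clock time, with each failure adding a fixed recovery cost. The function name and all numbers are hypothetical and are not taken from the paper's experiments.

```python
def ettr(productive_s, n_failures, recovery_s):
    """Effective Training Time Ratio under a simple model:
    wall-clock time = productive time + failures * per-failure recovery cost."""
    if productive_s <= 0 or n_failures < 0 or recovery_s < 0:
        raise ValueError("productive_s must be positive; other arguments non-negative")
    wall_clock_s = productive_s + n_failures * recovery_s
    return productive_s / wall_clock_s

# Hypothetical job: 10 hours of useful training and 10 failures.
full_restart = ettr(productive_s=36_000, n_failures=10, recovery_s=900)   # restart everything
role_recovery = ettr(productive_s=36_000, n_failures=10, recovery_s=100)  # restart one role
print(f"full restart: {full_restart:.3f}, role recovery: {role_recovery:.3f}")
```

Under this model, shrinking the per-failure recovery cost is the only lever on ETTR, which is why recovering a single role instead of the whole task matters.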

46.2 NI Apr 23
An Overlay Multicast Routing Method Based on Network Situational Awareness and Hierarchical Multi-Agent Reinforcement Learning

Miao Ye, Yanye Chen, Yong Wang et al.

Compared with IP multicast, Overlay Multicast (OM) offers better compatibility and flexible deployment in heterogeneous, cross-domain networks. However, traditional OM struggles to adapt to dynamic traffic due to unawareness of physical resource states, and existing reinforcement learning methods fail to decouple OM's tightly coupled multi-objective nature, leading to high complexity, slow convergence, and instability. To address this, we propose MA-DHRL-OM, a multi-agent deep hierarchical reinforcement learning approach. Using SDN's global view, it builds a traffic-aware model for OM path planning. The method decomposes OM tree construction into two stages via hierarchical agents, reducing action space and improving convergence stability. Multi-agent collaboration balances multi-objective optimization while enhancing scalability and adaptability. Experiments show MA-DHRL-OM outperforms existing methods in delay, bandwidth utilization, and packet loss, with more stable convergence and flexible routing.

NI Aug 27, 2024
MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN

Miao Ye, Hongwen Hu, Xiaoli Wang et al.

The cross-domain multicast routing problem in a software-defined wireless network with multiple controllers is a classic NP-hard optimization problem. As the network size increases, designing and implementing cross-domain multicast routing paths in the network requires not only designing efficient solution algorithms to obtain the optimal cross-domain multicast tree but also ensuring the timely and flexible acquisition and maintenance of global network state information. However, existing solutions have a limited ability to sense the network traffic state, affecting the quality of service of multicast services. In addition, these methods have difficulty adapting to the highly dynamically changing network states and have slow convergence speeds. To this end, this paper aims to design and implement a multiagent deep reinforcement learning based cross-domain multicast routing method for SDWN with multicontroller domains. First, a multicontroller communication mechanism and a multicast group management module are designed to transfer and synchronize network information between different control domains of the SDWN, thus effectively managing the joining and classification of members in the cross-domain multicast group. Second, a theoretical analysis and proof show that the optimal cross-domain multicast tree includes an interdomain multicast tree and an intradomain multicast tree. An agent is established for each controller, and a cooperation mechanism between multiple agents is designed to effectively optimize cross-domain multicast routing and ensure consistency and validity in the representation of network state information for cross-domain multicast routing decisions. Third, a multiagent reinforcement learning-based method that combines online and offline training is designed to reduce the dependence on the real-time environment and increase the convergence speed of multiple agents.

SP Feb 25, 2025
A Novel Spatiotemporal Correlation Anomaly Detection Method Based on Time-Frequency-Domain Feature Fusion and a Dynamic Graph Neural Network in Wireless Sensor Network

Miao Ye, Zhibang Jiang, Xingsi Xue et al.

Attention-based transformers have played an important role in wireless sensor network (WSN) time-series anomaly detection due to their ability to capture long-term dependencies. However, several issues must be addressed: their ability to capture long-term dependencies is not completely reliable, their computational complexity is high, and the spatiotemporal features of WSN time-series data are not sufficiently extracted for detecting the correlation anomalies of multinode WSN time-series data. To address these limitations, this paper proposes a WSN anomaly detection method that integrates frequency-domain features with dynamic graph neural networks (GNNs) under a designed autoencoder reconstruction framework. First, the discrete wavelet transform decomposes the trend and seasonal components of time series to address the poor long-term reliability of transformers. Second, a frequency-domain attention mechanism is designed to make full use of the difference between the amplitude distributions of normal data and anomalous data in this domain. Finally, a multimodal fusion-based dynamic graph convolutional network (MFDGCN) is designed by combining an attention mechanism and a graph convolutional network (GCN) to adaptively extract spatial correlation features. Experiments on public datasets demonstrate that the anomaly detection method designed in this paper achieves higher precision and recall than existing methods, with an F1 score of 93.5%, an improvement of 2.9% over that of the existing models.
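
The discrete wavelet transform step can be illustrated with a single-level Haar transform, which splits a series into a smooth approximation and a detail component. The Haar wavelet is chosen here only because it is the simplest case; the abstract does not say which wavelet or how many decomposition levels the method uses, and the sample series is made up.

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar DWT: returns (approximation, detail) coefficients."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:
        raise ValueError("input length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2)  # low-pass: local averages (slow variation)
    detail = (even - odd) / np.sqrt(2)  # high-pass: local differences (fast variation)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuilds the original series exactly."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2)
    x[1::2] = (approx - detail) / np.sqrt(2)
    return x

series = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 8.0, 0.0, 2.0])
approx, detail = haar_dwt(series)
restored = haar_idwt(approx, detail)
print(approx, detail, restored)
```

Applying `haar_dwt` again to `approx` gives a multi-level decomposition; for production use a library such as PyWavelets would normally replace this hand-written version.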

LG Jan 19
A Graph Prompt Fine-Tuning Method for WSN Spatio-Temporal Correlation Anomaly Detection

Miao Ye, Jing Cui, Yuan Huang et al.

Anomaly detection of multi-temporal modal data in Wireless Sensor Networks (WSNs) can provide an important guarantee for reliable network operation. Existing anomaly detection methods in multi-temporal modal data scenarios suffer from insufficient extraction of spatio-temporal correlation features, the high cost of annotating anomaly sample categories, and the imbalance of anomaly samples. In this paper, a graph neural network anomaly detection backbone network incorporating spatio-temporal correlation features and a multi-task self-supervised training strategy of "pre-training - graph prompting - fine-tuning" are designed for the characteristics of WSN graph structure data. First, the anomaly detection backbone network is designed by improving the Mamba model based on a multi-scale strategy and an inter-modal fusion method, and combining it with a variational graph convolution module, so that it can fully extract spatio-temporal correlation features in the multi-node, multi-temporal modal scenarios of WSNs. Second, we design a "pre-training" method with three subtasks (contrastive learning without negative samples, prediction, and reconstruction) to learn generic features of WSN data samples from unlabeled data, and a "graph prompting - fine-tuning" mechanism that guides the pre-trained self-supervised model through parameter fine-tuning, thereby reducing the training cost and enhancing the detection generalization performance. The F1 scores obtained from experiments on the public dataset and the actual collected dataset reach up to 91.30% and 92.31%, respectively, showing better detection performance and generalization ability than existing methods.

LG May 31, 2025
A New Spatiotemporal Correlation Anomaly Detection Method that Integrates Contrastive Learning and Few-Shot Learning in Wireless Sensor Networks

Miao Ye, Suxiao Wang, Jiaguang Han et al.

Detecting anomalies in the data collected by WSNs can provide crucial evidence for assessing the reliability and stability of WSNs. Existing methods for WSN anomaly detection often face challenges such as the limited extraction of spatiotemporal correlation features, the absence of sample labels, few anomaly samples, and an imbalanced sample distribution. To address these issues, a spatiotemporal correlation detection model (MTAD-RD) considering both model architecture and a two-stage training strategy perspective is proposed. In terms of model structure design, the proposed MTAD-RD backbone network includes a retentive network (RetNet) enhanced by a cross-retention (CR) module, a multigranular feature fusion module, and a graph attention network module to extract internode correlation information. This proposed model can integrate the intermodal correlation features and spatial features of WSN neighbor nodes while extracting global information from time series data. Moreover, its serialized inference characteristic can remarkably reduce inference overhead. For model training, a two-stage training approach was designed. First, a contrastive learning proxy task was designed for time series data with graph structure information in WSNs, enabling the backbone network to learn transferable features from unlabeled data using unsupervised contrastive learning methods, thereby addressing the issue of missing sample labels in the dataset. Then, a caching-based sample sampler was designed to divide samples into few-shot and contrastive learning data. A specific joint loss function was developed to jointly train the dual-graph discriminator network to address the problem of sample imbalance effectively. In experiments carried out on real public datasets, the designed MTAD-RD anomaly detection method achieved an F1 score of 90.97%, outperforming existing supervised WSN anomaly detection methods.

AI Mar 21, 2025
A New Segment Routing method with Swap Node Selection Strategy Based on Deep Reinforcement Learning for Software Defined Network

Miao Ye, Jihao Zheng, Qiuxiang Jiang et al.

Existing segment routing (SR) methods need to determine the routing first and then use path segmentation approaches to select swap nodes to form a segment routing path (SRP). They require re-segmentation of the path when the routing changes. Furthermore, they do not consider the flow table issuance time, and therefore cannot maximize the speed at which flow tables are issued. To address these issues, this paper establishes an optimization model that can simultaneously form routing strategies and path segmentation strategies for selecting the appropriate swap nodes to reduce flow table issuance time. It also designs an intelligent segment routing algorithm based on deep reinforcement learning (DRL-SR) to solve the proposed model. First, a traffic matrix is designed as the state space for the deep reinforcement learning agent; this matrix includes multiple QoS performance indicators, flow table issuance time overhead and SR label stack depth. Second, the action selection strategy and corresponding reward function are designed, where the agent selects the next node considering the routing; in addition, an action selection strategy that decides whether the newly added node is selected as a swap node, together with its reward function, is designed considering the time cost for the controller to issue the flow table to the swap node. Finally, experimental results show that, compared with the existing methods, the designed segment routing optimization model and the intelligent solution algorithm (DRL-SR) can reduce the time overhead required to complete the segment route establishment task while optimizing performance metrics such as throughput, delay and packet loss.

AI May 30, 2023
DHRL-FNMR: An Intelligent Multicast Routing Approach Based on Deep Hierarchical Reinforcement Learning in SDN

Miao Ye, Chenwei Zhao, Xingsi Xue et al.

The optimal multicast tree problem in the Software-Defined Networking (SDN) multicast routing is an NP-hard combinatorial optimization problem. Although existing SDN intelligent solution methods, which are based on deep reinforcement learning, can dynamically adapt to complex network link state changes, these methods are plagued by problems such as redundant branches, large action space, and slow agent convergence. In this paper, an SDN intelligent multicast routing algorithm based on deep hierarchical reinforcement learning is proposed to circumvent the aforementioned problems. First, the multicast tree construction problem is decomposed into two sub-problems: the fork node selection problem and the construction of the optimal path from the fork node to the destination node. Second, based on the information characteristics of SDN global network perception, the multicast tree state matrix, link bandwidth matrix, link delay matrix, link packet loss rate matrix, and sub-goal matrix are designed as the state space of intrinsic and meta controllers. Then, in order to mitigate the excessive action space, our approach constructs different action spaces at the upper and lower levels. The meta-controller generates an action space using network nodes to select the fork node, and the intrinsic controller uses the adjacent edges of the current node as its action space, thus implementing four different action selection strategies in the construction of the multicast tree. To facilitate the intelligent agent in constructing the optimal multicast tree with greater speed, we developed alternative reward strategies that distinguish between single-step node actions and multi-step actions towards multiple destination nodes.
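
The two-level action spaces in this abstract can be sketched on a small graph: the upper level chooses among network nodes, and the lower level chooses among edges adjacent to the current node. The graph, node labels, and function names below are hypothetical, and the sketch omits the learning components entirely.

```python
# Hypothetical topology as an adjacency map: source "s", destinations "d1" and "d2".
graph = {
    "s": {"a", "b"},
    "a": {"s", "d1"},
    "b": {"s", "d2"},
    "d1": {"a"},
    "d2": {"b"},
}

def meta_action_space(graph):
    """Upper level (meta-controller): every network node is a candidate fork node."""
    return sorted(graph)

def intrinsic_action_space(graph, current):
    """Lower level (intrinsic controller): only the edges adjacent to the current node."""
    return [(current, neighbor) for neighbor in sorted(graph[current])]

print(meta_action_space(graph))
print(intrinsic_action_space(graph, "s"))
```

The lower-level space grows with node degree rather than with the total number of links, which is the reduction in action space the abstract refers to.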

NI May 12, 2023
An Intelligent SDWN Routing Algorithm Based on Network Situational Awareness and Deep Reinforcement Learning

Jinqiang Li, Miao Ye, Linqiang Huang et al.

Due to the highly dynamic changes in wireless network topologies, efficiently obtaining network status information and flexibly forwarding data to improve communication quality of service are important challenges. This article introduces an intelligent routing algorithm (DRL-PPONSA) based on proximal policy optimization deep reinforcement learning with network situational awareness under a software-defined wireless networking architecture. First, a specific data plane is designed for network topology construction and data forwarding. The control plane collects network traffic information, sends flow tables, and uses a GCN-GRU prediction mechanism to perceive future traffic change trends to achieve network situational awareness. Second, a DRL-based data forwarding mechanism is designed in the knowledge plane. The predicted network traffic matrix and topology information matrix are treated as the environment for DRL agents, while next-hop adjacent nodes are treated as executable actions. Accordingly, action selection strategies are designed for different network conditions to achieve more intelligent, flexible, and efficient routing control. The reward function is designed using network link information and various reward and penalty mechanisms. Additionally, importance sampling and gradient clipping techniques are employed during gradient updating to enhance convergence speed and stability. Experimental results show that DRL-PPONSA outperforms traditional routing methods in network throughput, delay, packet loss rate, and wireless node distance. Compared to value-function-based Dueling DQN routing, the convergence speed is significantly improved, and the convergence effect is more stable. Simultaneously, its consumption of hardware storage space is reduced, and efficient routing decisions can be made in real-time using the current network state information.
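
The importance sampling and gradient clipping mentioned in this abstract are standard ingredients of proximal policy optimization. The sketch below shows the usual clipped surrogate loss and a norm-based gradient clip in NumPy; it is a generic illustration, with `eps`, `max_norm`, and the sample values chosen arbitrarily, and it is not the DRL-PPONSA implementation.

```python
import numpy as np

def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """Clipped PPO surrogate loss (to be minimized).

    The importance sampling ratio pi_new / pi_old is computed from log-probabilities
    and clipped to [1 - eps, 1 + eps] to limit the size of each policy update.
    """
    new_logp, old_logp = np.asarray(new_logp, float), np.asarray(old_logp, float)
    advantages = np.asarray(advantages, float)
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    return -np.mean(np.minimum(unclipped, clipped))

def clip_grad_norm(grad, max_norm):
    """Rescale a gradient vector so its L2 norm does not exceed max_norm."""
    grad = np.asarray(grad, float)
    norm = np.linalg.norm(grad)
    return grad if norm <= max_norm else grad * (max_norm / norm)

# The new policy doubles the probability of an action with positive advantage:
# the ratio of 2.0 is clipped to 1.2, so the loss is about -1.2.
loss = ppo_clip_loss(new_logp=[np.log(0.5)], old_logp=[np.log(0.25)], advantages=[1.0])
clipped_grad = clip_grad_norm([3.0, 4.0], max_norm=1.0)
print(loss, clipped_grad)
```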

NI May 12, 2023
Intelligent multicast routing method based on multi-agent deep reinforcement learning in SDWN

Hongwen Hu, Miao Ye, Chenwei Zhao et al.

Multicast communication technology is widely applied in wireless environments with a high device density. Traditional wireless network architectures have difficulty flexibly obtaining and maintaining global network state information and cannot quickly respond to network state changes, thus affecting the throughput, delay, and other QoS requirements of existing multicasting solutions. Therefore, this paper proposes a new multicast routing method based on multiagent deep reinforcement learning (MADRL-MR) in a software-defined wireless networking (SDWN) environment. First, SDWN technology is adopted to flexibly configure the network and obtain network state information in the form of traffic matrices representing global network link information, such as link bandwidth, delay, and packet loss rate. Second, the multicast routing problem is divided into multiple subproblems, which are solved through multiagent cooperation. To enable each agent to accurately understand the current network state and the status of multicast tree construction, the state space of each agent is designed based on the traffic and multicast tree status matrices, and the set of AP nodes in the network is used as the action space. A novel single-hop action strategy is designed, along with a reward function based on the four states that may occur during tree construction: progress, invalid, loop, and termination. Finally, a decentralized training approach is combined with transfer learning to enable each agent to quickly adapt to dynamic network changes and accelerate convergence. Simulation experiments show that MADRL-MR outperforms existing algorithms in terms of throughput, delay, packet loss rate, etc., and can establish more intelligent multicast routes.
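
One way to read the four tree-construction states named in this abstract (progress, invalid, loop, termination) is as outcomes of a single-hop action. The sketch below encodes that reading; the exact conditions, the reward values in `REWARDS`, and the toy topology are assumptions for illustration and are not taken from the paper.

```python
# Hypothetical reward values per outcome; a real reward would also use link metrics.
REWARDS = {"progress": 0.1, "invalid": -1.0, "loop": -0.5, "termination": 1.0}

def classify_step(tree_nodes, current, next_node, neighbors, destinations):
    """Classify one single-hop action taken from `current` to `next_node`."""
    if next_node not in neighbors.get(current, set()):
        return "invalid"      # no link between the two nodes
    if next_node in tree_nodes:
        return "loop"         # node is already part of the tree
    if destinations <= tree_nodes | {next_node}:
        return "termination"  # every destination is now covered
    return "progress"         # valid extension, tree not yet complete

neighbors = {0: {1, 2}, 1: {0, 3}, 2: {0}, 3: {1}}
outcome = classify_step({0, 1}, current=1, next_node=3, neighbors=neighbors, destinations={3})
print(outcome, REWARDS[outcome])
```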

LG Feb 19, 2022
A Novel Anomaly Detection Method for Multimodal WSN Data Flow via a Dynamic Graph Neural Network

Qinghao Zhang, Miao Ye, Hongbing Qiu et al.

Anomaly detection is widely used to distinguish system anomalies by analyzing the temporal and spatial features of wireless sensor network (WSN) data streams; it is one of the critical techniques that ensure the reliability of WSNs. Currently, graph neural networks (GNNs) have become popular state-of-the-art methods for conducting anomaly detection on WSN data streams. However, the existing anomaly detection methods based on GNNs do not consider the temporal and spatial features of WSN data streams simultaneously, such as multi-node, multi-modal and multi-time features, which seriously impacts their effectiveness. In this paper, a novel anomaly detection model is proposed for multimodal WSN data flows, where three GNNs are used to separately extract the temporal features of WSN data flows, the correlation features between different modes and the spatial features between sensor node positions. Specifically, the temporal features and modal correlation features extracted from each sensor node are first fused into one vector representation, which is further aggregated with the spatial features, i.e., the spatial position relationships of the nodes; finally, the current time-series data of WSN nodes are predicted, and abnormal states are identified according to the fused features. The simulation results obtained on a public dataset show that the proposed approach significantly improves upon the existing methods in terms of robustness, and its F1 score reaches 0.90, which is 14.2% higher than that of the graph convolution network (GCN) with long short-term memory (LSTM).
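
Several abstracts on this page report F1 scores. For reference, the sketch below computes F1 from detection counts using the standard definition (the harmonic mean of precision and recall). The counts are made up for illustration and do not correspond to any reported result.

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    if tp == 0:
        return 0.0  # no correct detections: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical detector: 6 anomalies found correctly, 2 false alarms, 4 anomalies missed.
score = f1_score(tp=6, fp=2, fn=4)
print(f"F1 = {score:.4f}")
```

Because anomalies are rare in WSN data, F1 is usually preferred over accuracy, which a detector could maximize by never raising an alarm.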