Xiaoniu Yang

LG
h-index17
28papers
774citations
Novelty46%
AI Score44

28 Papers

CVOct 5, 2023Code
Can pre-trained models assist in dataset distillation?

Yao Lu, Xuguang Chen, Yuchen Zhang et al. · pku

Dataset Distillation (DD) is a prominent technique that encapsulates knowledge from a large-scale original dataset into a small synthetic dataset for efficient training. Meanwhile, Pre-trained Models (PTMs) function as knowledge repositories, containing extensive information from the original dataset. This naturally raises a question: Can PTMs effectively transfer knowledge to synthetic datasets, guiding DD accurately? To this end, we conduct preliminary experiments, confirming the contribution of PTMs to DD. Afterwards, we systematically study different options in PTMs, including initialization parameters, model architecture, training epoch and domain knowledge, revealing that: 1) Increasing model diversity enhances the performance of synthetic datasets; 2) Sub-optimal models can also assist in DD and outperform well-trained ones in certain cases; 3) Domain-specific PTMs are not mandatory for DD, but a reasonable domain match is crucial. Finally, by selecting optimal options, we significantly improve the cross-architecture generalization over baseline DD methods. We hope our work will facilitate researchers to develop better DD techniques. Our code is available at https://github.com/yaolu-zjut/DDInterpreter.

LGJun 23, 2023Code
PathMLP: Smooth Path Towards High-order Homophily

Jiajun Zhou, Chenxuan Xie, Shengbo Gong et al.

Real-world graphs exhibit increasing heterophily, where nodes no longer tend to be connected to nodes with the same label, challenging the homophily assumption of classical graph neural networks (GNNs) and impeding their performance. Intriguingly, from the observation of heterophilous data, we notice that certain high-order information exhibits higher homophily, which motivates us to involve high-order information in node representation learning. However, common practices in GNNs to acquire high-order information mainly through increasing model depth and altering message-passing mechanisms, which, albeit effective to a certain extent, suffer from three shortcomings: 1) over-smoothing due to excessive model depth and propagation times; 2) high-order information is not fully utilized; 3) low computational efficiency. In this regard, we design a similarity-based path sampling strategy to capture smooth paths containing high-order homophily. Then we propose a lightweight model based on multi-layer perceptrons (MLP), named PathMLP, which can encode messages carried by paths via simple transformation and concatenation operations, and effectively learn node representations in heterophilous graphs through adaptive path aggregation. Extensive experiments demonstrate that our method outperforms baselines on 16 out of 20 datasets, underlining its effectiveness and superiority in alleviating the heterophily problem. In addition, our method is immune to over-smoothing and has high computational efficiency. The source code will be available in https://github.com/Graph4Sec-Team/PathMLP.

LGAug 1, 2024Code
Enhancing Ethereum Fraud Detection via Generative and Contrastive Self-supervision

Chenxiang Jin, Jiajun Zhou, Chenxuan Xie et al.

The rampant fraudulent activities on Ethereum hinder the healthy development of the blockchain ecosystem, necessitating the reinforcement of regulations. However, multiple imbalances involving account interaction frequencies and interaction types in the Ethereum transaction environment pose significant challenges to data mining-based fraud detection research. To address this, we first propose the concept of meta-interactions to refine interaction behaviors in Ethereum, and based on this, we present a dual self-supervision enhanced Ethereum fraud detection framework, named Meta-IFD. This framework initially introduces a generative self-supervision mechanism to augment the interaction features of accounts, followed by a contrastive self-supervision mechanism to differentiate various behavior patterns, and ultimately characterizes the behavioral representations of accounts and mines potential fraud risks through multi-view interaction feature learning. Extensive experiments on real Ethereum datasets demonstrate the effectiveness and superiority of our framework in detecting common Ethereum fraud behaviors such as Ponzi schemes and phishing scams. Additionally, the generative module can effectively alleviate the interaction distribution imbalance in Ethereum data, while the contrastive module significantly enhances the framework's ability to distinguish different behavior patterns. The source code will be available in https://github.com/GISec-Team/Meta-IFD.

LGJun 4, 2023Code
Clarify Confused Nodes via Separated Learning

Jiajun Zhou, Shengbo Gong, Xuanze Chen et al.

Graph neural networks (GNNs) have achieved remarkable advances in graph-oriented tasks. However, real-world graphs invariably contain a certain proportion of heterophilous nodes, challenging the homophily assumption of traditional GNNs and hindering their performance. Most existing studies continue to design generic models with shared weights between heterophilous and homophilous nodes. Despite the incorporation of high-order messages or multi-channel architectures, these efforts often fall short. A minority of studies attempt to train different node groups separately but suffer from inappropriate separation metrics and low efficiency. In this paper, we first propose a new metric, termed Neighborhood Confusion (NC), to facilitate a more reliable separation of nodes. We observe that node groups with different levels of NC values exhibit certain differences in intra-group accuracy and visualized embeddings. These pave the way for Neighborhood Confusion-guided Graph Convolutional Network (NCGCN), in which nodes are grouped by their NC values and accept intra-group weight sharing and message passing. Extensive experiments on both homophilous and heterophilous benchmarks demonstrate that our framework can effectively separate nodes and yield significant performance improvement compared to the latest methods. The source code will be available in https://github.com/GISec-Team/NCGNN.

SPApr 5, 2022
Mixing Signals: Data Augmentation Approach for Deep Learning Based Modulation Recognition

Xinjie Xu, Zhuangzhi Chen, Dongwei Xu et al.

With the rapid development of deep learning, automatic modulation recognition (AMR), as an important task in cognitive radio, has gradually transformed from traditional feature extraction and classification to automatic classification by deep learning technology. However, deep learning models are data-driven methods, which often require a large amount of data as the training support. Data augmentation, as the strategy of expanding dataset, can improve the generalization of the deep learning models and thus improve the accuracy of the models to a certain extent. In this paper, for AMR of radio signals, we propose a data augmentation strategy based on mixing signals and consider four specific methods (Random Mixing, Maximum-Similarity-Mixing, $θ-$Similarity Mixing and n-times Random Mixing) to achieve data augmentation. Experiments show that our proposed method can improve the classification accuracy of deep learning based AMR models in the full public dataset RML2016.10a. In particular, for the case of a single signal-to-noise ratio signal set, the classification accuracy can be significantly improved, which verifies the effectiveness of the methods.

LGNov 7, 2023
Augmenting Radio Signals with Wavelet Transform for Deep Learning-Based Modulation Recognition

Tao Chen, Shilian Zheng, Kunfeng Qiu et al.

The use of deep learning for radio modulation recognition has become prevalent in recent years. This approach automatically extracts high-dimensional features from large datasets, facilitating the accurate classification of modulation schemes. However, in real-world scenarios, it may not be feasible to gather sufficient training data in advance. Data augmentation is a method used to increase the diversity and quantity of training dataset and to reduce data sparsity and imbalance. In this paper, we propose data augmentation methods that involve replacing detail coefficients decomposed by discrete wavelet transform for reconstructing to generate new samples and expand the training set. Different generation methods are used to generate replacement sequences. Simulation results indicate that our proposed methods significantly outperform the other augmentation methods.

LGDec 20, 2022
Data Augmentation on Graphs: A Technical Survey

Jiajun Zhou, Chenxuan Xie, Shengbo Gong et al.

In recent years, graph representation learning has achieved remarkable success while suffering from low-quality data problems. As a mature technology to improve data quality in computer vision, data augmentation has also attracted increasing attention in graph domain. To advance research in this emerging direction, this survey provides a comprehensive review and summary of existing graph data augmentation (GDAug) techniques. Specifically, this survey first provides an overview of various feasible taxonomies and categorizes existing GDAug studies based on multi-scale graph elements. Subsequently, for each type of GDAug technique, this survey formalizes standardized technical definition, discuss the technical details, and provide schematic illustration. The survey also reviews domain-specific graph data augmentation techniques, including those for heterogeneous graphs, temporal graphs, spatio-temporal graphs, and hypergraphs. In addition, this survey provides a summary of available evaluation metrics and design guidelines for graph data augmentation. Lastly, it outlines the applications of GDAug at both the data and model levels, discusses open issues in the field, and looks forward to future directions. The latest advances in GDAug are summarized in GitHub.

CRAug 17, 2023
AIR: Threats of Adversarial Attacks on Deep Learning-Based Information Recovery

Jinyin Chen, Jie Ge, Shilian Zheng et al.

A wireless communications system usually consists of a transmitter which transmits the information and a receiver which recovers the original information from the received distorted signal. Deep learning (DL) has been used to improve the performance of the receiver in complicated channel environments and state-of-the-art (SOTA) performance has been achieved. However, its robustness has not been investigated. In order to evaluate the robustness of DL-based information recovery models under adversarial circumstances, we investigate adversarial attacks on the SOTA DL-based information recovery model, i.e., DeepReceiver. We formulate the problem as an optimization problem with power and peak-to-average power ratio (PAPR) constraints. We design different adversarial attack methods according to the adversary's knowledge of DeepReceiver's model and/or testing samples. Extensive experiments show that the DeepReceiver is vulnerable to the designed attack methods in all of the considered scenarios. Even in the scenario of both model and test sample restricted, the adversary can attack the DeepReceiver and increase its bit error rate (BER) above 10%. It can also be found that the DeepReceiver is vulnerable to adversarial perturbations even with very low power and limited PAPR. These results suggest that defense measures should be taken to enhance the robustness of DeepReceiver.

SPNov 8, 2023
Deep Learning-Based Frequency Offset Estimation

Tao Chen, Shilian Zheng, Jiawei Zhu et al.

In wireless communication systems, the asynchronization of the oscillators in the transmitter and the receiver along with the Doppler shift due to relative movement may lead to the presence of carrier frequency offset (CFO) in the received signals. Estimation of CFO is crucial for subsequent processing such as coherent demodulation. In this brief, we demonstrate the utilization of deep learning for CFO estimation by employing a residual network (ResNet) to learn and extract signal features from the raw in-phase (I) and quadrature (Q) components of the signals. We use multiple modulation schemes in the training set to make the trained model adaptable to multiple modulations or even new signals. In comparison to the commonly used traditional CFO estimation methods, our proposed IQ-ResNet method exhibits superior performance across various scenarios including different oversampling ratios, various signal lengths, and different channels

CVNov 6, 2023
Deep Image Semantic Communication Model for Artificial Intelligent Internet of Things

Li Ping Qian, Yi Zhang, Sikai Lyu et al.

With the rapid development of Artificial Intelligent Internet of Things (AIoT), the image data from AIoT devices has been witnessing the explosive increasing. In this paper, a novel deep image semantic communication model is proposed for the efficient image communication in AIoT. Particularly, at the transmitter side, a high-precision image semantic segmentation algorithm is proposed to extract the semantic information of the image to achieve significant compression of the image data. At the receiver side, a semantic image restoration algorithm based on Generative Adversarial Network (GAN) is proposed to convert the semantic image to a real scene image with detailed information. Simulation results demonstrate that the proposed image semantic communication model can improve the image compression ratio and recovery accuracy by 71.93% and 25.07% on average in comparison with WebP and CycleGAN, respectively. More importantly, our demo experiment shows that the proposed model reduces the total delay by 95.26% in the image communication, when comparing with the original image transmission.

LGJun 6, 2023
Subgraph Networks Based Contrastive Learning

Jinhuan Wang, Jiafei Shao, Zeyu Wang et al.

Graph contrastive learning (GCL), as a self-supervised learning method, can solve the problem of annotated data scarcity. It mines explicit features in unannotated graphs to generate favorable graph representations for downstream tasks. Most existing GCL methods focus on the design of graph augmentation strategies and mutual information estimation operations. Graph augmentation produces augmented views by graph perturbations. These views preserve a locally similar structure and exploit explicit features. However, these methods have not considered the interaction existing in subgraphs. To explore the impact of substructure interactions on graph representations, we propose a novel framework called subgraph network-based contrastive learning (SGNCL). SGNCL applies a subgraph network generation strategy to produce augmented views. This strategy converts the original graph into an Edge-to-Node mapping network with both topological and attribute features. The single-shot augmented view is a first-order subgraph network that mines the interaction between nodes, node-edge, and edges. In addition, we also investigate the impact of the second-order subgraph augmentation on mining graph structure interactions, and further, propose a contrastive objective that fuses the first-order and second-order subgraph information. We compare SGNCL with classical and state-of-the-art graph contrastive learning methods on multiple benchmark datasets of different domains. Extensive experiments show that SGNCL achieves competitive or better performance (top three) on all datasets in unsupervised learning settings. Furthermore, SGNCL achieves the best average gain of 6.9\% in transfer learning compared to the best method. Finally, experiments also demonstrate that mining substructure interactions have positive implications for graph contrastive learning.

LGAug 5, 2024
MDM: Advancing Multi-Domain Distribution Matching for Automatic Modulation Recognition Dataset Synthesis

Dongwei Xu, Jiajun Chen, Yao Lu et al.

Recently, deep learning technology has been successfully introduced into Automatic Modulation Recognition (AMR) tasks. However, the success of deep learning is all attributed to the training on large-scale datasets. Such a large amount of data brings huge pressure on storage, transmission and model training. In order to solve the problem of large amount of data, some researchers put forward the method of data distillation, which aims to compress large training data into smaller synthetic datasets to maintain its performance. While numerous data distillation techniques have been developed within the realm of image processing, the unique characteristics of signals set them apart. Signals exhibit distinct features across various domains, necessitating specialized approaches for their analysis and processing. To this end, a novel dataset distillation method--Multi-domain Distribution Matching (MDM) is proposed. MDM employs the Discrete Fourier Transform (DFT) to translate timedomain signals into the frequency domain, and then uses a model to compute distribution matching losses between the synthetic and real datasets, considering both the time and frequency domains. Ultimately, these two losses are integrated to update the synthetic dataset. We conduct extensive experiments on three AMR datasets. Experimental results show that, compared with baseline methods, our method achieves better performance under the same compression ratio. Furthermore, we conduct crossarchitecture generalization experiments on several models, and the experimental results show that our synthetic datasets can generalize well on other unseen models.

LGNov 23, 2024Code
Reassessing Layer Pruning in LLMs: New Insights and Methods

Yao Lu, Hao Cheng, Yujie Fang et al.

Although large language models (LLMs) have achieved remarkable success across various domains, their considerable scale necessitates substantial computational resources, posing significant challenges for deployment in resource-constrained environments. Layer pruning, as a simple yet effective compression method, removes layers of a model directly, reducing computational overhead. However, what are the best practices for layer pruning in LLMs? Are sophisticated layer selection metrics truly effective? Does the LoRA (Low-Rank Approximation) family, widely regarded as a leading method for pruned model fine-tuning, truly meet expectations when applied to post-pruning fine-tuning? To answer these questions, we dedicate thousands of GPU hours to benchmarking layer pruning in LLMs and gaining insights across multiple dimensions. Our results demonstrate that a simple approach, i.e., pruning the final 25\% of layers followed by fine-tuning the \texttt{lm\_head} and the remaining last three layer, yields remarkably strong performance. Following this guide, we prune Llama-3.1-8B-It and obtain a model that outperforms many popular LLMs of similar size, such as ChatGLM2-6B, Vicuna-7B-v1.5, Qwen1.5-7B and Baichuan2-7B. We release the optimal model weights on Huggingface, and the code is available on GitHub.

45.2SPApr 20
Deep Learning for Multi-Antenna Modulation Recognition of Radio Signals

Tao Chen, Shilian Zheng, Jiepeng Chen et al.

Multi-antenna receiving systems have become a prevalent technical solution in communication systems. Meanwhile, deep learning has achieved significant progress in automatic modulation recognition tasks in single-antenna systems. However, the application of deep learning in multi-antenna modulation recognition (MAMR) tasks is still limited. In this paper, we propose an MAMR method namely MAMR-IQ to fully explore the diversity gain of a multi-antenna receiving system, which concatenates the raw received in-phase and quadrature (IQ) signals of multiple antennas and feeds them into a convolutional neural network. Simulation results show that the proposed MAMR-IQ method outperforms two existing deep learning-based MAMR methods which are based on direct voting (DV) and weight average (WA) in terms of both recognition accuracy and computational complexity. To address the problem of limited training data in few-shot scenarios, we further propose a data augmentation method that involves exchanging IQ sequences received by any two antennas to generate augmented samples. Simulation results show that with the proposed augmentation method, the recognition accuracy can be further improved.

CVNov 24, 2021Code
Understanding the Dynamics of DNNs Using Graph Modularity

Yao Lu, Wen Yang, Yunzhe Zhang et al.

There are good arguments to support the claim that deep neural networks (DNNs) capture better feature representations than the previous hand-crafted feature engineering, which leads to a significant performance improvement. In this paper, we move a tiny step towards understanding the dynamics of feature representations over layers. Specifically, we model the process of class separation of intermediate representations in pre-trained DNNs as the evolution of communities in dynamic graphs. Then, we introduce modularity, a generic metric in graph theory, to quantify the evolution of communities. In the preliminary experiment, we find that modularity roughly tends to increase as the layer goes deeper and the degradation and plateau arise when the model complexity is great relative to the dataset. Through an asymptotic analysis, we prove that modularity can be broadly used for different applications. For example, modularity provides new insights to quantify the difference between feature representations. More crucially, we demonstrate that the degradation and plateau in modularity curves represent redundant layers in DNNs and can be pruned with minimal impact on performance, which provides theoretical guidance for layer pruning. Our code is available at https://github.com/yaolu-zjut/Dynamic-Graphs-Construction.

LGNov 15, 2024
RedTest: Towards Measuring Redundancy in Deep Neural Networks Effectively

Yao Lu, Peixin Zhang, Jingyi Wang et al.

Deep learning has revolutionized computing in many real-world applications, arguably due to its remarkable performance and extreme convenience as an end-to-end solution. However, deep learning models can be costly to train and to use, especially for those large-scale models, making it necessary to optimize the original overly complicated models into smaller ones in scenarios with limited resources such as mobile applications or simply for resource saving. The key question in such model optimization is, how can we effectively identify and measure the redundancy in a deep learning model structure. While several common metrics exist in the popular model optimization techniques to measure the performance of models after optimization, they are not able to quantitatively inform the degree of remaining redundancy. To address the problem, we present a novel testing approach, i.e., RedTest, which proposes a novel testing metric called Model Structural Redundancy Score (MSRS) to quantitatively measure the degree of redundancy in a deep learning model structure. We first show that MSRS is effective in both revealing and assessing the redundancy issues in many state-of-the-art models, which urgently calls for model optimization. Then, we utilize MSRS to assist deep learning model developers in two practical application scenarios: 1) in Neural Architecture Search, we design a novel redundancy-aware algorithm to guide the search for the optimal model structure and demonstrate its effectiveness by comparing it to existing standard NAS practice; 2) in the pruning of large-scale pre-trained models, we prune the redundant layers of pre-trained models with the guidance of layer similarity to derive less redundant ones of much smaller size. Extensive experimental results demonstrate that removing such redundancy has a negligible effect on the model utility.

CRNov 15, 2024
Lateral Movement Detection via Time-aware Subgraph Classification on Authentication Logs

Jiajun Zhou, Jiacheng Yao, Xuanze Chen et al.

Lateral movement is a crucial component of advanced persistent threat (APT) attacks in networks. Attackers exploit security vulnerabilities in internal networks or IoT devices, expanding their control after initial infiltration to steal sensitive data or carry out other malicious activities, posing a serious threat to system security. Existing research suggests that attackers generally employ seemingly unrelated operations to mask their malicious intentions, thereby evading existing lateral movement detection methods and hiding their intrusion traces. In this regard, we analyze host authentication log data from a graph perspective and propose a multi-scale lateral movement detection framework called LMDetect. The main workflow of this framework proceeds as follows: 1) Construct a heterogeneous multigraph from host authentication log data to strengthen the correlations among internal system entities; 2) Design a time-aware subgraph generator to extract subgraphs centered on authentication events from the heterogeneous authentication multigraph; 3) Design a multi-scale attention encoder that leverages both local and global attention to capture hidden anomalous behavior patterns in the authentication subgraphs, thereby achieving lateral movement detection. Extensive experiments on two real-world authentication log datasets demonstrate the effectiveness and superiority of our framework in detecting lateral movement behaviors.

LGOct 15, 2024
Rethinking Graph Transformer Architecture Design for Node Classification

Jiajun Zhou, Xuanze Chen, Chenxuan Xie et al.

Graph Transformer (GT), as a special type of Graph Neural Networks (GNNs), utilizes multi-head attention to facilitate high-order message passing. However, this also imposes several limitations in node classification applications: 1) nodes are susceptible to global noise; 2) self-attention computation cannot scale well to large graphs. In this work, we conduct extensive observational experiments to explore the adaptability of the GT architecture in node classification tasks and draw several conclusions: the current multi-head self-attention module in GT can be completely replaceable, while the feed-forward neural network module proves to be valuable. Based on this, we decouple the propagation (P) and transformation (T) of GNNs and explore a powerful GT architecture, named GNNFormer, which is based on the P/T combination message passing and adapted for node classification in both homophilous and heterophilous scenarios. Extensive experiments on 12 benchmark datasets demonstrate that our proposed GT architecture can effectively adapt to node classification tasks without being affected by global noise and computational efficiency limitations.

LGJun 12, 2024
A Generic Layer Pruning Method for Signal Modulation Recognition Deep Learning Models

Yao Lu, Yutao Zhu, Yuqi Li et al.

With the successful application of deep learning in communications systems, deep neural networks are becoming the preferred method for signal classification. Although these models yield impressive results, they often come with high computational complexity and large model sizes, which hinders their practical deployment in communication systems. To address this challenge, we propose a novel layer pruning method. Specifically, we decompose the model into several consecutive blocks, each containing consecutive layers with similar semantics. Then, we identify layers that need to be preserved within each block based on their contribution. Finally, we reassemble the pruned blocks and fine-tune the compact model. Extensive experiments on five datasets demonstrate the efficiency and effectiveness of our method over a variety of state-of-the-art baselines, including layer pruning and channel pruning methods.

LGNov 22, 2021
Graph-Based Similarity of Neural Network Representations

Zuohui Chen, Yao Lu, Jinxuan Hu et al.

Understanding the black-box representations in Deep Neural Networks (DNN) is an essential problem in deep learning. In this work, we propose Graph-Based Similarity (GBS) to measure the similarity of layer features. Contrary to previous works that compute the similarity directly on the feature maps, GBS measures the correlation based on the graph constructed with hidden layer outputs. By treating each input sample as a node and the corresponding layer output similarity as edges, we construct the graph of DNN representations for each layer. The similarity between graphs of layers identifies the correspondences between representations of models trained in different datasets and initializations. We demonstrate and prove the invariance property of GBS, including invariance to orthogonal transformation and invariance to isotropic scaling, and compare GBS with CKA. GBS shows state-of-the-art performance in reflecting the similarity and provides insights on explaining the adversarial sample behavior on the hidden layer space.

LGOct 28, 2021
RGP: Neural Network Pruning through Its Regular Graph Structure

Zhuangzhi Chen, Jingyang Xiang, Yao Lu et al.

Lightweight model design has become an important direction in the application of deep learning technology, pruning is an effective mean to achieve a large reduction in model parameters and FLOPs. The existing neural network pruning methods mostly start from the importance of parameters, and design parameter evaluation metrics to perform parameter pruning iteratively. These methods are not studied from the perspective of model topology, may be effective but not efficient, and requires completely different pruning for different datasets. In this paper, we study the graph structure of the neural network, and propose regular graph based pruning (RGP) to perform a one-shot neural network pruning. We generate a regular graph, set the node degree value of the graph to meet the pruning ratio, and reduce the average shortest path length of the graph by swapping the edges to obtain the optimal edge distribution. Finally, the obtained graph is mapped into a neural network structure to realize pruning. Experiments show that the average shortest path length of the graph is negatively correlated with the classification accuracy of the corresponding neural network, and the proposed RGP shows a strong precision retention capability with extremely high parameter reduction (more than 90%) and FLOPs reduction (more than 90%).

LGJul 9, 2021
GGT: Graph-Guided Testing for Adversarial Sample Detection of Deep Neural Network

Zuohui Chen, Renxuan Wang, Jingyang Xiang et al.

Deep Neural Networks (DNN) are known to be vulnerable to adversarial samples, the detection of which is crucial for the wide application of these DNN models. Recently, a number of deep testing methods in software engineering were proposed to find the vulnerability of DNN systems, and one of them, i.e., Model Mutation Testing (MMT), was used to successfully detect various adversarial samples generated by different kinds of adversarial attacks. However, the mutated models in MMT are always huge in number (e.g., over 100 models) and lack diversity (e.g., can be easily circumvented by high-confidence adversarial samples), which makes it less efficient in real applications and less effective in detecting high-confidence adversarial samples. In this study, we propose Graph-Guided Testing (GGT) for adversarial sample detection to overcome these aforementioned challenges. GGT generates pruned models with the guide of graph characteristics, each of them has only about 5% parameters of the mutated model in MMT, and graph guided models have higher diversity. The experiments on CIFAR10 and SVHN validate that GGT performs much better than MMT with respect to both effectiveness and efficiency.

LGJun 16, 2021
Adaptive Visibility Graph Neural Network and its Application in Modulation Classification

Qi Xuan, Kunfeng Qiu, Jinchao Zhou et al.

Our digital world is full of time series and graphs which capture the various aspects of many complex systems. Traditionally, there are respective methods in processing these two different types of data, e.g., Recurrent Neural Network (RNN) and Graph Neural Network (GNN), while in recent years, time series could be mapped to graphs by using the techniques such as Visibility Graph (VG), so that researchers can use graph algorithms to mine the knowledge in time series. Such mapping methods establish a bridge between time series and graphs, and have high potential to facilitate the analysis of various real-world time series. However, the VG method and its variants are just based on fixed rules and thus lack of flexibility, largely limiting their application in reality. In this paper, we propose an Adaptive Visibility Graph (AVG) algorithm that can adaptively map time series into graphs, based on which we further establish an end-to-end classification framework AVGNet, by utilizing GNN model DiffPool as the classifier. We then adopt AVGNet for radio signal modulation classification which is an important task in the field of wireless communication. The simulations validate that AVGNet outperforms a series of advanced deep learning methods, achieving the state-of-the-art performance in this task.

LGMar 1, 2021
CLPVG: Circular limited penetrable visibility graph as a new network model for time series

Qi Xuan, Jinchao Zhou, Kunfeng Qiu et al.

Visibility Graph (VG) transforms time series into graphs, facilitating signal processing by advanced graph data mining algorithms. In this paper, based on the classic Limited Penetrable Visibility Graph (LPVG) method, we propose a novel nonlinear mapping method named Circular Limited Penetrable Visibility Graph (CLPVG). The testing on degree distribution and clustering coefficient on the generated graphs of typical time series validates that our CLPVG is able to effectively capture the important features of time series and has better anti-noise ability than traditional LPVG. The experiments on real-world time-series datasets of radio signal and electroencephalogram (EEG) also suggest that the structural features provided by CLPVG, rather than LPVG, are more useful for time-series classification, leading to higher accuracy. And this classification performance can be further enhanced through structural feature expansion by adopting Subgraph Networks (SGN). All of these results validate the effectiveness of our CLPVG model.

SPOct 28, 2020
SigNet: A Novel Deep Learning Framework for Radio Signal Classification

Zhuangzhi Chen, Hui Cui, Jingyang Xiang et al.

Deep learning methods achieve great success in many areas due to their powerful feature extraction capabilities and end-to-end training mechanism, and recently they are also introduced for radio signal modulation classification. In this paper, we propose a novel deep learning framework called SigNet, where a signal-to-matrix (S2M) operator is adopted to convert the original signal into a square matrix first and is co-trained with a follow-up CNN architecture for classification. This model is further accelerated by integrating 1D convolution operators, leading to the upgraded model SigNet2.0. The simulations on two signal datasets show that both SigNet and SigNet2.0 outperform a number of well-known baselines. More interestingly, our proposed models behave extremely well in small-sample learning when only a small training dataset is provided. They can achieve a relatively high accuracy even when 1\% training data are kept, while other baseline models may lose their effectiveness much more quickly as the datasets get smaller. Such result suggests that SigNet/SigNet2.0 could be extremely useful in the situations where labeled signal data are difficult to obtain. The visualization of the output features of our models demonstrates that our model can well divide different modulation types of signals in the feature hyper-space.

SPSep 13, 2019
Spectrum Sensing Based on Deep Learning Classification for Cognitive Radios

Shilian Zheng, Shichuan Chen, Peihan Qi et al.

Spectrum sensing is a key technology for cognitive radios. We present spectrum sensing as a classification problem and propose a sensing method based on deep learning classification. We normalize the received signal power to overcome the effects of noise power uncertainty. We train the model with as many types of signals as possible as well as noise data to enable the trained network model to adapt to untrained new signals. We also use transfer learning strategies to improve the performance for real-world signals. Extensive experiments are conducted to evaluate the performance of this method. The simulation results show that the proposed method performs better than two traditional spectrum sensing methods, i.e., maximum-minimum eigenvalue ratio-based method and frequency domain entropy-based method. In addition, the experimental results of the new untrained signal types show that our method can adapt to the detection of these new signals. Furthermore, the real-world signal detection experiment results show that the detection performance can be further improved by transfer learning. Finally, experiments under colored noise show that our proposed method has superior detection performance under colored noise, while the traditional methods have a significant performance degradation, which further validate the superiority of our method.

CRJul 21, 2019
Open DNN Box by Power Side-Channel Attack

Yun Xiang, Zhuangzhi Chen, Zuohui Chen et al.

Deep neural networks are becoming popular and important assets of many AI companies. However, recent studies indicate that they are also vulnerable to adversarial attacks. Adversarial attacks can be either white-box or black-box. The white-box attacks assume full knowledge of the models while the black-box ones assume none. In general, revealing more internal information can enable much more powerful and efficient attacks. However, in most real-world applications, the internal information of embedded AI devices is unavailable, i.e., they are black-box. Therefore, in this work, we propose a side-channel information based technique to reveal the internal information of black-box models. Specifically, we have made the following contributions: (1) we are the first to use side-channel information to reveal internal network architecture in embedded devices; (2) we are the first to construct models for internal parameter estimation; and (3) we validate our methods on real-world devices and applications. The experimental results show that our method can achieve 96.50\% accuracy on average. Such results suggest that we should pay strong attention to the security problem of many AI applications, and further propose corresponding defensive strategies in the future.

LGApr 20, 2019
Deep Learning for Large-Scale Real-World ACARS and ADS-B Radio Signal Classification

Shichuan Chen, Shilian Zheng, Lifeng Yang et al.

Radio signal classification has a very wide range of applications in the field of wireless communications and electromagnetic spectrum management. In recent years, deep learning has been used to solve the problem of radio signal classification and has achieved good results. However, the radio signal data currently used is very limited in scale. In order to verify the performance of the deep learning-based radio signal classification on real-world radio signal data, in this paper we conduct experiments on large-scale real-world ACARS and ADS-B signal data with sample sizes of 900,000 and 13,000,000, respectively, and with categories of 3,143 and 5,157 respectively. We use the same Inception-Residual neural network model structure for ACARS signal classification and ADS-B signal classification to verify the ability of a single basic deep neural network model structure to process different types of radio signals, i.e., communication bursts in ACARS and pulse bursts in ADS-B. We build an experimental system for radio signal deep learning experiments. Experimental results show that the signal classification accuracy of ACARS and ADS-B is 98.1% and 96.3%, respectively. When the signal-to-noise ratio (with injected additive white Gaussian noise) is greater than 9 dB, the classification accuracy is greater than 92%. These experimental results validate the ability of deep learning to classify large-scale real-world radio signals. The results of the transfer learning experiment show that the model trained on large-scale ADS-B datasets is more conducive to the learning and training of new tasks than the model trained on small-scale datasets.