CRJan 7, 2024Code
Improving Transferability of Network Intrusion Detection in a Federated Learning SetupShreya Ghosh, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal
Network Intrusion Detection Systems (IDS) aim to detect the presence of an intruder by analyzing network packets arriving at an internet connected device. Data-driven deep learning systems, popular due to their superior performance compared to traditional IDS, depend on availability of high quality training data for diverse intrusion classes. A way to overcome this limitation is through transferable learning, where training for one intrusion class can lead to detection of unseen intrusion classes after deployment. In this paper, we provide a detailed study on the transferability of intrusion detection. We investigate practical federated learning configurations to enhance the transferability of intrusion detection. We propose two techniques to significantly improve the transferability of a federated intrusion detection system. The code for this work can be found at https://github.com/ghosh64/transferability.
CRAug 12, 2025Code
Developing a Transferable Federated Network Intrusion Detection SystemAbu Shafin Mohammad Mahdee Jameel, Shreya Ghosh, Aly El Gamal
Intrusion Detection Systems (IDS) are a vital part of a network-connected device. In this paper, we develop a deep learning based intrusion detection system that is deployed in a distributed setup across devices connected to a network. Our aim is to better equip deep learning models against unknown attacks using knowledge from known attacks. To this end, we develop algorithms to maximize the number of transferability relationships. We propose a Convolutional Neural Network (CNN) model, along with two algorithms that maximize the number of relationships observed. One is a two step data pre-processing stage, and the other is a Block-Based Smart Aggregation (BBSA) algorithm. The proposed system succeeds in achieving superior transferability performance while maintaining impressive local detection rates. We also show that our method is generalizable, exhibiting transferability potential across datasets and even with different backbones. The code for this work can be found at https://github.com/ghosh64/tabfidsv2.
LGAug 12, 2025Code
FetFIDS: A Feature Embedding Attention based Federated Network Intrusion Detection AlgorithmShreya Ghosh, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal
Intrusion Detection Systems (IDS) have an increasingly important role in preventing exploitation of network vulnerabilities by malicious actors. Recent deep learning based developments have resulted in significant improvements in the performance of IDS systems. In this paper, we present FetFIDS, where we explore the employment of feature embedding instead of positional embedding to improve intrusion detection performance of a transformer based deep learning system. Our model is developed with the aim of deployments in edge learning scenarios, where federated learning over multiple communication rounds can ensure both privacy and localized performance improvements. FetFIDS outperforms multiple state-of-the-art intrusion detection systems in a federated environment and demonstrates a high degree of suitability to federated learning. The code for this work can be found at https://github.com/ghosh64/fetfids.
CVFeb 19, 2025Code
A Racing Dataset and Baseline Model for Track Detection in Autonomous RacingShreya Ghosh, Yi-Huan Chen, Ching-Hsiang Huang et al.
A significant challenge in racing-related research is the lack of publicly available datasets containing raw images with corresponding annotations for the downstream task. In this paper, we introduce RoRaTrack, a novel dataset that contains annotated multi-camera image data from racing scenarios for track detection. The data is collected on a Dallara AV-21 at a racing circuit in Indiana, in collaboration with the Indy Autonomous Challenge (IAC). RoRaTrack addresses common problems such as blurriness due to high speed, color inversion from the camera, and absence of lane markings on the track. Consequently, we propose RaceGAN, a baseline model based on a Generative Adversarial Network (GAN) that effectively addresses these challenges. The proposed model demonstrates superior performance compared to current state-of-the-art machine learning models in track detection. The dataset and code for this work are available at https://github.com/ghosh64/RaceGAN.
CRDec 17, 2023Code
A Study on Transferability of Deep Learning Models for Network Intrusion DetectionShreya Ghosh, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal
In this paper, we explore transferability in learning between different attack classes in a network intrusion detection setup. We evaluate transferability of attack classes by training a deep learning model with a specific attack class and testing it on a separate attack class. We observe the effects of real and synthetically generated data augmentation techniques on transferability. We investigate the nature of observed transferability relationships, which can be either symmetric or asymmetric. We also examine explainability of the transferability relationships using the recursive feature elimination algorithm. We study data preprocessing techniques to boost model performance. The code for this work can be found at https://github.com/ghosh64/transferability.
SPJan 7, 2024
Deep OFDM Channel Estimation: Capturing Frequency RecurrenceAbu Shafin Mohammad Mahdee Jameel, Akshay Malhotra, Aly El Gamal et al.
In this paper, we propose a deep-learning-based channel estimation scheme in an orthogonal frequency division multiplexing (OFDM) system. Our proposed method, named Single Slot Recurrence Along Frequency Network (SisRafNet), is based on a novel study of recurrent models for exploiting sequential behavior of channels across frequencies. Utilizing the fact that wireless channels have a high degree of correlation across frequencies, we employ recurrent neural network techniques within a single OFDM slot, thus overcoming the latency and memory constraints typically associated with recurrence based methods. The proposed SisRafNet delivers superior estimation performance compared to existing deep-learning-based channel estimation techniques and the performance has been validated on a wide range of 3rd Generation Partnership Project (3GPP) compliant channel scenarios at multiple signal-to-noise ratios.
LGJan 7, 2024
Data-Driven Subsampling in the Presence of an Adversarial ActorAbu Shafin Mohammad Mahdee Jameel, Ahmed P. Mohamed, Jinho Yi et al.
Deep learning based automatic modulation classification (AMC) has received significant attention owing to its potential applications in both military and civilian use cases. Recently, data-driven subsampling techniques have been utilized to overcome the challenges associated with computational complexity and training time for AMC. Beyond these direct advantages of data-driven subsampling, these methods also have regularizing properties that may improve the adversarial robustness of the modulation classifier. In this paper, we investigate the effects of an adversarial attack on an AMC system that employs deep learning models both for AMC and for subsampling. Our analysis shows that subsampling itself is an effective deterrent to adversarial attacks. We also uncover the most efficient subsampling strategy when an adversarial attack on both the classifier and the subsampler is anticipated.
CVOct 22, 2021
CD&S Dataset: Handheld Imagery Dataset Acquired Under Field Conditions for Corn Disease Identification and Severity EstimationAanis Ahmad, Dharmendra Saraswat, Aly El Gamal et al.
Accurate disease identification and its severity estimation is an important consideration for disease management. Deep learning-based solutions for disease management using imagery datasets are being increasingly explored by the research community. However, most reported studies have relied on imagery datasets that were acquired under controlled lab conditions. As a result, such models lacked the ability to identify diseases in the field. Therefore, to train a robust deep learning model for field use, an imagery dataset was created using raw images acquired under field conditions using a handheld sensor and augmented images with varying backgrounds. The Corn Disease and Severity (CD&S) dataset consisted of 511, 524, and 562, field acquired raw images, corresponding to three common foliar corn diseases, namely Northern Leaf Blight (NLB), Gray Leaf Spot (GLS), and Northern Leaf Spot (NLS), respectively. For training disease identification models, half of the imagery data for each disease was annotated using bounding boxes and also used to generate 2343 additional images through augmentation using three different backgrounds. For severity estimation, an additional 515 raw images for NLS were acquired and categorized into severity classes ranging from 1 (resistant) to 5 (susceptible). Overall, the CD&S dataset consisted of 4455 total images comprising of 2112 field images and 2343 augmented images.
ROMay 4, 2021
Towards End-to-End Deep Learning for Autonomous Racing: On Data Collection and a Unified Architecture for Steering and Throttle PredictionShakti N. Wadekar, Benjamin J. Schwartz, Shyam S. Kannan et al.
Deep Neural Networks (DNNs) which are trained end-to-end have been successfully applied to solve complex problems that we have not been able to solve in past decades. Autonomous driving is one of the most complex problems which is yet to be completely solved and autonomous racing adds more complexity and exciting challenges to this problem. Towards the challenge of applying end-to-end learning to autonomous racing, this paper shows results on two aspects: (1) Analyzing the relationship between the driving data used for training and the maximum speed at which the DNN can be successfully applied for predicting steering angle, (2) Neural network architecture and training methodology for learning steering and throttle without any feedback or recurrent connections.
CRApr 3, 2021
Mitigating Gradient-based Adversarial Attacks via Denoising and CompressionRehana Mahfuz, Rajeev Sahay, Aly El Gamal
Gradient-based adversarial attacks on deep neural networks pose a serious threat, since they can be deployed by adding imperceptible perturbations to the test data of any network, and the risk they introduce cannot be assessed through the network's original training performance. Denoising and dimensionality reduction are two distinct methods that have been independently investigated to combat such attacks. While denoising offers the ability to tailor the defense to the specific nature of the attack, dimensionality reduction offers the advantage of potentially removing previously unseen perturbations, along with reducing the training time of the network being defended. We propose strategies to combine the advantages of these two defense mechanisms. First, we propose the cascaded defense, which involves denoising followed by dimensionality reduction. To reduce the training time of the defense for a small trade-off in performance, we propose the hidden layer defense, which involves feeding the output of the encoder of a denoising autoencoder into the network. Further, we discuss how adaptive attacks against these defenses could become significantly weak when an alternative defense is used, or when no defense is used. In this light, we propose a new metric to evaluate a defense which measures the sensitivity of the adaptive attack to modifications in the defense. Finally, we present a guideline for building an ordered repertoire of defenses, a.k.a. a defense infrastructure, that adjusts to limited computational resources in presence of uncertainty about the attack strategy.
NIApr 3, 2021
Gradient-based Adversarial Deep Modulation Classification with Data-driven SubsamplingJinho Yi, Aly El Gamal
Automatic modulation classification can be a core component for intelligent spectrally efficient wireless communication networks, and deep learning techniques have recently been shown to deliver superior performance to conventional model-based strategies, particularly when distinguishing between a large number of modulation types. However, such deep learning techniques have also been recently shown to be vulnerable to gradient-based adversarial attacks that rely on subtle input perturbations, which would be particularly feasible in a wireless setting via jamming. One such potent attack is the one known as the Carlini-Wagner attack, which we consider in this work. We further consider a data-driven subsampling setting, where several recently introduced deep-learning-based algorithms are employed to select a subset of samples that lead to reducing the final classifier's training time with minimal loss in accuracy. In this setting, the attacker has to make an assumption about the employed subsampling strategy, in order to calculate the loss gradient. Based on state of the art techniques available to both the attacker and defender, we evaluate best strategies under various assumptions on the knowledge of the other party's strategy. Interestingly, in presence of knowledgeable attackers, we identify computational cost reduction opportunities for the defender with no or minimal loss in performance.
NIApr 3, 2021
Knowledge Distillation For Wireless Edge LearningAhmed P. Mohamed, Abu Shafin Mohammad Mahdee Jameel, Aly El Gamal
In this paper, we propose a framework for predicting frame errors in the collaborative spectrally congested wireless environments of the DARPA Spectrum Collaboration Challenge (SC2) via a recently collected dataset. We employ distributed deep edge learning that is shared among edge nodes and a central cloud. Using this close-to-practice dataset, we find that widely used federated learning approaches, specially those that are privacy preserving, are worse than local training for a wide range of settings. We hence utilize the synthetic minority oversampling technique to maintain privacy via avoiding the transfer of local data to the cloud, and utilize knowledge distillation with an aim to benefit from high cloud computing and storage capabilities. The proposed framework achieves overall better performance than both local and federated training approaches, while being robust against catastrophic failures as well as challenging channel conditions that result in high frame error rates.
ITFeb 9, 2021
A Provably Convergent Information Bottleneck Solution via ADMMTeng-Hui Huang, Aly El Gamal
The Information bottleneck (IB) method enables optimizing over the trade-off between compression of data and prediction accuracy of learned representations, and has successfully and robustly been applied to both supervised and unsupervised representation learning problems. However, IB has several limitations. First, the IB problem is hard to optimize. The IB Lagrangian $\mathcal{L}_{IB}:=I(X;Z)-βI(Y;Z)$ is non-convex and existing solutions guarantee only local convergence. As a result, the obtained solutions depend on initialization. Second, the evaluation of a solution is also a challenging task. Conventionally, it resorts to characterizing the information plane, that is, plotting $I(Y;Z)$ versus $I(X;Z)$ for all solutions obtained from different initial points. Furthermore, the IB Lagrangian has phase transitions while varying the multiplier $β$. At phase transitions, both $I(X;Z)$ and $I(Y;Z)$ increase abruptly and the rate of convergence becomes significantly slow for existing solutions. Recent works with IB adopt variational surrogate bounds to the IB Lagrangian. Although allowing efficient optimization, how close are these surrogates to the IB Lagrangian is not clear. In this work, we solve the IB Lagrangian using augmented Lagrangian methods. With augmented variables, we show that the IB objective can be solved with the alternating direction method of multipliers (ADMM). Different from prior works, we prove that the proposed algorithm is consistently convergent, regardless of the value of $β$. Empirically, our gradient-descent-based method results in information plane points that are comparable to those obtained through the conventional Blahut-Arimoto-based solvers and is convergent for a wider range of the penalty coefficient than previous ADMM solvers.
CVDec 2, 2020
Data-driven Analysis of Turbulent Flame ImagesRathziel Roncancio, Jupyoung Kim, Aly El Gamal et al.
Turbulent premixed flames are important for power generation using gas turbines. Improvements in characterization and understanding of turbulent flames continue particularly for transient events like ignition and extinction. Pockets or islands of unburned material are features of turbulent flames during these events. These features are directly linked to heat release rates and hydrocarbons emissions. Unburned material pockets in turbulent CH$_4$/air premixed flames with CO$_2$ addition were investigated using OH Planar Laser-Induced Fluorescence images. Convolutional Neural Networks (CNN) were used to classify images containing unburned pockets for three turbulent flames with 0%, 5%, and 10% CO$_2$ addition. The CNN model was constructed using three convolutional layers and two fully connected layers using dropout and weight decay. The CNN model achieved accuracies of 91.72%, 89.35% and 85.80% for the three flames, respectively.
SPJul 27, 2020
Deep Learning for DOA Estimation in MIMO Radar Systems via Emulation of Large Antenna ArraysAya Mostafa Ahmed, Udaya Sampath K. P. Miriya Thanthrige, Aly El Gamal et al.
We present a MUSIC-based Direction of Arrival (DOA) estimation strategy using small antenna arrays, via employing deep learning for reconstructing the signals of a virtual large antenna array. Not only does the proposed strategy deliver significantly better performance than simply plugging the incoming signals into MUSIC, but surprisingly, the performance is also better than directly using an actual large antenna array with MUSIC for high angle ranges and low test SNR values. We further analyze the best choice for the training SNR as a function of the test SNR, and observe dramatic changes in the behavior of this function for different angle ranges.
SPMay 10, 2020
Ensemble Wrapper Subsampling for Deep Modulation ClassificationSharan Ramjee, Shengtai Ju, Diyu Yang et al.
Subsampling of received wireless signals is important for relaxing hardware requirements as well as the computational cost of signal processing algorithms that rely on the output samples. We propose a subsampling technique to facilitate the use of deep learning for automatic modulation classification in wireless communication systems. Unlike traditional approaches that rely on pre-designed strategies that are solely based on expert knowledge, the proposed data-driven subsampling strategy employs deep neural network architectures to simulate the effect of removing candidate combinations of samples from each training input vector, in a manner inspired by how wrapper feature selection models work. The subsampled data is then processed by another deep learning classifier that recognizes each of the considered 10 modulation types. We show that the proposed subsampling strategy not only introduces drastic reduction in the classifier training time, but can also improve the classification accuracy to higher levels than those reached before for the considered dataset. An important feature herein is exploiting the transferability property of deep neural networks to avoid retraining the wrapper models and obtain superior performance through an ensemble of wrappers over that possible through solely relying on any of them.
SPMar 22, 2020
Deep Learning for Frame Error Prediction using a DARPA Spectrum Collaboration Challenge (SC2) DatasetAbu Shafin Mohammad Mahdee Jameel, Ahmed P. Mohamed, Xiwen Zhang et al.
We demonstrate a first example for employing deep learning in predicting frame errors for a Collaborative Intelligent Radio Network (CIRN) using a dataset collected during participation in the final scrimmages of the DARPA SC2 challenge. Four scenarios are considered based on randomizing or fixing the strategy for bandwidth and channel allocation, and either training and testing with different links or using a pilot phase for each link to train the deep neural network. We also investigate the effect of latency constraints, and uncover interesting characteristics of the predictor over different Signal to Noise Ratio (SNR) ranges. The obtained insights open the door for implementing a deep-learning-based strategy that is scalable to large heterogeneous networks, generalizable to diverse wireless environments, and suitable for predicting frame error instances and rates within a congested shared spectrum.
LGFeb 22, 2020
Non-Intrusive Detection of Adversarial Deep Learning Attacks via Observer NetworksKirthi Shankar Sivamani, Rajeev Sahay, Aly El Gamal
Recent studies have shown that deep learning models are vulnerable to specifically crafted adversarial inputs that are quasi-imperceptible to humans. In this letter, we propose a novel method to detect adversarial inputs, by augmenting the main classification network with multiple binary detectors (observer networks) which take inputs from the hidden layers of the original network (convolutional kernel outputs) and classify the input as clean or adversarial. During inference, the detectors are treated as a part of an ensemble network and the input is deemed adversarial if at least half of the detectors classify it as so. The proposed method addresses the trade-off between accuracy of classification on clean and adversarial samples, as the original classification network is not modified during the detection process. The use of multiple observer networks makes attacking the detection mechanism non-trivial even when the attacker is aware of the victim classifier. We achieve a 99.5% detection accuracy on the MNIST dataset and 97.5% on the CIFAR-10 dataset using the Fast Gradient Sign Attack in a semi-white box setup. The number of false positive detections is a mere 0.12% in the worst case scenario.
LGJan 26, 2020
Ensemble Noise Simulation to Handle Uncertainty about Gradient-based Adversarial AttacksRehana Mahfuz, Rajeev Sahay, Aly El Gamal
Gradient-based adversarial attacks on neural networks can be crafted in a variety of ways by varying either how the attack algorithm relies on the gradient, the network architecture used for crafting the attack, or both. Most recent work has focused on defending classifiers in a case where there is no uncertainty about the attacker's behavior (i.e., the attacker is expected to generate a specific attack using a specific network architecture). However, if the attacker is not guaranteed to behave in a certain way, the literature lacks methods in devising a strategic defense. We fill this gap by simulating the attacker's noisy perturbation using a variety of attack algorithms based on gradients of various classifiers. We perform our analysis using a pre-processing Denoising Autoencoder (DAE) defense that is trained with the simulated noise. We demonstrate significant improvements in post-attack accuracy, using our proposed ensemble-trained defense, compared to a situation where no effort is made to handle uncertainty.
LGDec 26, 2019
Efficient Training of Deep Classifiers for Wireless Source Identification using Test SNR EstimatesXingchen Wang, Shengtai Ju, Xiwen Zhang et al.
We study efficient deep learning training algorithms that process received wireless signals, if a test Signal to Noise Ratio (SNR) estimate is available. We focus on two tasks that facilitate source identification: 1- Identifying the modulation type, 2- Identifying the wireless technology and channel in the 2.4 GHz ISM band. For benchmarking, we rely on recent literature on testing deep learning algorithms against two well-known datasets. We first demonstrate that using training data corresponding only to the test SNR value leads to dramatic reductions in training time while incurring a small loss in average test accuracy, as it improves the accuracy for low SNR values. Further, we show that an erroneous test SNR estimate with a small positive offset is better for training than another having the same error magnitude with a negative offset. Secondly, we introduce a greedy training SNR Boosting algorithm that leads to uniform improvement in accuracy across all tested SNR values, while using a small subset of training SNR values at each test SNR. Finally, we demonstrate the potential of bootstrap aggregating (Bagging) based on training SNR values to improve generalization at low test SNR values with scarcity of training data.
CRDec 25, 2019
Grand Challenges in Resilience: Autonomous System Resilience through Design and Runtime MeasuresSaurabh Bagchi, Vaneet Aggarwal, Somali Chaterji et al.
A set of about 80 researchers, practitioners, and federal agency program managers participated in the NSF-sponsored Grand Challenges in Resilience Workshop held on Purdue campus on March 19-21, 2019. The workshop was divided into three themes: resilience in cyber, cyber-physical, and socio-technical systems. About 30 attendees in all participated in the discussions of cyber resilience. This article brings out the substantive parts of the challenges and solution approaches that were identified in the cyber resilience theme. In this article, we put forward the substantial challenges in cyber resilience in a few representative application domains and outline foundational solutions to address these challenges. These solutions fall into two broad themes: resilience-by-design and resilience-by-reaction. We use examples of autonomous systems as the application drivers motivating cyber resilience. We focus on some autonomous systems in the near horizon (autonomous ground and aerial vehicles) and also a little more distant (autonomous rescue and relief). For resilience-by-design, we focus on design methods in software that are needed for our cyber systems to be resilient. In contrast, for resilience-by-reaction, we discuss how to make systems resilient by responding, reconfiguring, or recovering at runtime when failures happen. We also discuss the notion of adaptive execution to improve resilience, execution transparently and adaptively among available execution platforms (mobile/embedded, edge, and cloud). For each of the two themes, we survey the current state, and the desired state and ways to get there. We conclude the paper by looking at the research challenges we will have to solve in the short and the mid-term to make the vision of resilient autonomous systems a reality.
LGJun 13, 2019
A Computationally Efficient Method for Defending Adversarial Deep Learning AttacksRajeev Sahay, Rehana Mahfuz, Aly El Gamal
The reliance on deep learning algorithms has grown significantly in recent years. Yet, these models are highly vulnerable to adversarial attacks, which introduce visually imperceptible perturbations into testing data to induce misclassifications. The literature has proposed several methods to combat such adversarial attacks, but each method either fails at high perturbation values, requires excessive computing power, or both. This letter proposes a computationally efficient method for defending the Fast Gradient Sign (FGS) adversarial attack by simultaneously denoising and compressing data. Specifically, our proposed defense relies on training a fully connected multi-layer Denoising Autoencoder (DAE) and using its encoder as a defense against the adversarial attack. Our results show that using this dimensionality reduction scheme is not only highly effective in mitigating the effect of the FGS attack in multiple threat models, but it also provides a 2.43x speedup in comparison to defense strategies providing similar robustness against the same attack.
LGMay 28, 2019
Efficient Wrapper Feature Selection using Autoencoder and Model Based EliminationSharan Ramjee, Aly El Gamal
We propose a computationally efficient wrapper feature selection method - called Autoencoder and Model Based Elimination of features using Relevance and Redundancy scores (AMBER) - that uses a single ranker model along with autoencoders to perform greedy backward elimination of features. The ranker model is used to prioritize the removal of features that are not critical to the classification task, while the autoencoders are used to prioritize the elimination of correlated features. We demonstrate the superior feature selection ability of AMBER on 4 well known datasets corresponding to different domain applications via comparing the classification accuracies with other computationally efficient state-of-the-art feature selection techniques. Interestingly, we find that the ranker model that is used for feature selection does not necessarily have to be the same as the final classifier that is trained on the selected features. Finally, we note how a smaller number of features can lead to higher accuracies on some datasets, and hypothesize that overfitting the ranker model on the training set facilitates the selection of more salient features.
SPMay 16, 2019
Deep Learning for Interference Identification: Band, Training SNR, and Sample SelectionXiwen Zhang, Tolunay Seyfi, Shengtai Ju et al.
We study the problem of interference source identification, through the lens of recognizing one of 15 different channels that belong to 3 different wireless technologies: Bluetooth, Zigbee, and WiFi. We employ deep learning algorithms trained on received samples taken from a 10 MHz band in the 2.4 GHz ISM Band. We obtain a classification accuracy of around 89.5% using any of four different deep neural network architectures: CNN, ResNet, CLDNN, and LSTM, which demonstrate the generality of the effectiveness of deep learning at the considered task. Interestingly, our proposed CNN architecture requires approximately 60% of the training time required by the state of the art while achieving slightly larger classification accuracy. We then focus on the CNN architecture and further optimize its training time while incurring minimal loss in classification accuracy using three different approaches: 1- Band Selection, where we only use samples belonging to the lower and uppermost 2 MHz bands, 2- SNR Selection, where we only use training samples belonging to a single SNR value, and 3- Sample Selection, where we try various sub-Nyquist sampling methods to select the subset of samples most relevant to the classification task. Our results confirm the feasibility of fast deep learning for wireless interference identification, by showing that the training time can be reduced by as much as 30x with minimal loss in accuracy.
SPJan 16, 2019
Fast Deep Learning for Automatic Modulation ClassificationSharan Ramjee, Shengtai Ju, Diyu Yang et al.
In this work, we investigate the feasibility and effectiveness of employing deep learning algorithms for automatic recognition of the modulation type of received wireless communication signals from subsampled data. Recent work considered a GNU radio-based data set that mimics the imperfections in a real wireless channel and uses 10 different modulation types. A Convolutional Neural Network (CNN) architecture was then developed and shown to achieve performance that exceeds that of expert-based approaches. Here, we continue this line of work and investigate deep neural network architectures that deliver high classification accuracy. We identify three architectures - namely, a Convolutional Long Short-term Deep Neural Network (CLDNN), a Long Short-Term Memory neural network (LSTM), and a deep Residual Network (ResNet) - that lead to typical classification accuracy values around 90% at high SNR. We then study algorithms to reduce the training time by minimizing the size of the training data set, while incurring a minimal loss in classification accuracy. To this end, we demonstrate the performance of Principal Component Analysis in significantly reducing the training time, while maintaining good performance at low SNR. We also investigate subsampling techniques that further reduce the training time, and pave the way for online classification at high SNR. Finally, we identify representative SNR values for training each of the candidate architectures, and consequently, realize drastic reductions of the training time, with negligible loss in classification accuracy.
LGDec 7, 2018
Combatting Adversarial Attacks through Denoising and Dimensionality Reduction: A Cascaded Autoencoder ApproachRajeev Sahay, Rehana Mahfuz, Aly El Gamal
Machine Learning models are vulnerable to adversarial attacks that rely on perturbing the input data. This work proposes a novel strategy using Autoencoder Deep Neural Networks to defend a machine learning model against two gradient-based attacks: The Fast Gradient Sign attack and Fast Gradient attack. First we use an autoencoder to denoise the test data, which is trained with both clean and corrupted data. Then, we reduce the dimension of the denoised data using the hidden layer representation of another autoencoder. We perform this experiment for multiple values of the bound of adversarial perturbations, and consider different numbers of reduced dimensions. When the test data is preprocessed using this cascaded pipeline, the tested deep neural network classifier yields a much higher accuracy, thus mitigating the effect of the adversarial perturbation.
LGDec 1, 2017
Deep Neural Network Architectures for Modulation ClassificationXiaoyu Liu, Diyu Yang, Aly El Gamal
In this work, we investigate the value of employing deep learning for the task of wireless signal modulation recognition. Recently in [1], a framework has been introduced by generating a dataset using GNU radio that mimics the imperfections in a real wireless channel, and uses 10 different modulation types. Further, a convolutional neural network (CNN) architecture was developed and shown to deliver performance that exceeds that of expert-based approaches. Here, we follow the framework of [1] and find deep neural network architectures that deliver higher accuracy than the state of the art. We tested the architecture of [1] and found it to achieve an accuracy of approximately 75% of correctly recognizing the modulation type. We first tune the CNN architecture of [1] and find a design with four convolutional layers and two dense layers that gives an accuracy of approximately 83.8% at high SNR. We then develop architectures based on the recently introduced ideas of Residual Networks (ResNet [2]) and Densely Connected Networks (DenseNet [3]) to achieve high SNR accuracies of approximately 83.5% and 86.6%, respectively. Finally, we introduce a Convolutional Long Short-term Deep Neural Network (CLDNN [4]) to achieve an accuracy of approximately 88.5% at high SNR.
LGMay 26, 2017
A Sampling Theory Perspective of Graph-based Semi-supervised LearningAamir Anis, Aly El Gamal, Salman Avestimehr et al.
Graph-based methods have been quite successful in solving unsupervised and semi-supervised learning problems, as they provide a means to capture the underlying geometry of the dataset. It is often desirable for the constructed graph to satisfy two properties: first, data points that are similar in the feature space should be strongly connected on the graph, and second, the class label information should vary smoothly with respect to the graph, where smoothness is measured using the spectral properties of the graph Laplacian matrix. Recent works have justified some of these smoothness conditions by showing that they are strongly linked to the semi-supervised smoothness assumption and its variants. In this work, we reinforce this connection by viewing the problem from a graph sampling theoretic perspective, where class indicator functions are treated as bandlimited graph signals (in the eigenvector basis of the graph Laplacian) and label prediction as a bandlimited reconstruction problem. Our approach involves analyzing the bandwidth of class indicator signals generated from statistical data models with separable and nonseparable classes. These models are quite general and mimic the nature of most real-world datasets. Our results show that in the asymptotic limit, the bandwidth of any class indicator is also closely related to the geometry of the dataset. This allows one to theoretically justify the assumption of bandlimitedness of class indicator signals, thereby providing a sampling theoretic interpretation of graph-based semi-supervised classification.
LGFeb 14, 2015
Asymptotic Justification of Bandlimited Interpolation of Graph signals for Semi-Supervised LearningAamir Anis, Aly El Gamal, A. Salman Avestimehr et al.
Graph-based methods play an important role in unsupervised and semi-supervised learning tasks by taking into account the underlying geometry of the data set. In this paper, we consider a statistical setting for semi-supervised learning and provide a formal justification of the recently introduced framework of bandlimited interpolation of graph signals. Our analysis leads to the interpretation that, given enough labeled data, this method is very closely related to a constrained low density separation problem as the number of data points tends to infinity. We demonstrate the practical utility of our results through simple experiments.