IVMay 27, 2022
Calibrated Bagging Deep Learning for Image Semantic Segmentation: A Case Study on COVID-19 Chest X-ray ImageLucy Nwosu, Xiangfang Li, Lijun Qian et al.
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) causes coronavirus disease 2019 (COVID-19). Imaging tests such as chest X-ray (CXR) and computed tomography (CT) can provide useful information to clinical staff for facilitating a diagnosis of COVID-19 in a more efficient and comprehensive manner. As a breakthrough of artificial intelligence (AI), deep learning has been applied to perform COVID-19 infection region segmentation and disease classification by analyzing CXR and CT data. However, prediction uncertainty of deep learning models for these tasks, which is very important to safety-critical applications like medical image processing, has not been comprehensively investigated. In this work, we propose a novel ensemble deep learning model through integrating bagging deep learning and model calibration to not only enhance segmentation performance, but also reduce prediction uncertainty. The proposed method has been validated on a large dataset that is associated with CXR image segmentation. Experimental results demonstrate that the proposed method can improve the segmentation performance, as well as decrease prediction uncertainties.
CLJun 10, 2023
Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event ClassificationShouvon Sarker, Lijun Qian, Xishuang Dong
The identification of key factors such as medications, diseases, and relationships within electronic health records and clinical notes has a wide range of applications in the clinical field. In the N2C2 2022 competitions, various tasks were presented to promote the identification of key factors in electronic health records (EHRs) using the Contextualized Medication Event Dataset (CMED). Pretrained large language models (LLMs) demonstrated exceptional performance in these tasks. This study aims to explore the utilization of LLMs, specifically ChatGPT, for data augmentation to overcome the limited availability of annotated data for identifying the key factors in EHRs. Additionally, different pre-trained BERT models, initially trained on extensive datasets like Wikipedia and MIMIC, were employed to develop models for identifying these key variables in EHRs through fine-tuning on augmented datasets. The experimental results of two EHR analysis tasks, namely medication identification and medication event classification, indicate that data augmentation based on ChatGPT proves beneficial in improving performance for both medication identification and medication event classification.
LGMay 15, 2022
Effect of Batch Normalization on Noise Resistant Property of Deep Learning ModelsOmobayode Fagbohungbe, Lijun Qian
The fast execution speed and energy efficiency of analog hardware has made them a strong contender for deployment of deep learning model at the edge. However, there are concerns about the presence of analog noise which causes changes to the weight of the models, leading to performance degradation of deep learning model, despite their inherent noise resistant characteristics. The effect of the popular batch normalization layer on the noise resistant ability of deep learning model is investigated in this work. This systematic study has been carried out by first training different models with and without batch normalization layer on CIFAR10 and CIFAR100 dataset. The weights of the resulting models are then injected with analog noise and the performance of the models on the test dataset is obtained and compared. The results show that the presence of batch normalization layer negatively impacts noise resistant property of deep learning model and the impact grows with the increase of the number of batch normalization layers.
CVJan 11, 2023
Inverse Quantum Fourier Transform Inspired Algorithm for Unsupervised Image SegmentationTaoreed Akinola, Xiangfang Li, Richard Wilkins et al.
Image segmentation is a very popular and important task in computer vision. In this paper, inverse quantum Fourier transform (IQFT) for image segmentation has been explored and a novel IQFT-inspired algorithm is proposed and implemented by leveraging the underlying mathematical structure of the IQFT. Specifically, the proposed method takes advantage of the phase information of the pixels in the image by encoding the pixels' intensity into qubit relative phases and applying IQFT to classify the pixels into different segments automatically and efficiently. To the best of our knowledge, this is the first attempt of using IQFT for unsupervised image segmentation. The proposed method has low computational cost comparing to the deep learning-based methods and more importantly it does not require training, thus make it suitable for real-time applications. The performance of the proposed method is compared with K-means and Otsu-thresholding. The proposed method outperforms both of them on the PASCAL VOC 2012 segmentation benchmark and the xVIEW2 challenge dataset by as much as 50% in terms of mean Intersection-Over-Union (mIOU).
LGMay 8, 2022
Impact of Learning Rate on Noise Resistant Property of Deep Learning ModelsOmobayode Fagbohungbe, Lijun Qian
The interest in analog computation has grown tremendously in recent years due to its fast computation speed and excellent energy efficiency, which is very important for edge and IoT devices in the sub-watt power envelope for deep learning inferencing. However, significant performance degradation suffered by deep learning models due to the inherent noise present in the analog computation can limit their use in mission-critical applications. Hence, there is a need to understand the impact of critical model hyperparameters choice on the resulting model noise-resistant property. This need is critical as the insight obtained can be used to design deep learning models that are robust to analog noise. In this paper, the impact of the learning rate, a critical design choice, on the noise-resistant property is investigated. The study is achieved by first training deep learning models using different learning rates. Thereafter, the models are injected with analog noise and the noise-resistant property of the resulting models is examined by measuring the performance degradation due to the analog noise. The results showed there exists a sweet spot of learning rate values that achieves a good balance between model prediction performance and model noise-resistant property. Furthermore, the theoretical justification of the observed phenomenon is provided.
LGNov 3, 2025
Transmitter Identification and Protocol Categorization in Shared Spectrum via Multi-Task RF Classification at the Network EdgeTariq Abdul-Quddoos, Tasnia Sharmin, Xiangfang Li et al.
As spectrum sharing becomes increasingly vital to meet rising wireless demands in the future, spectrum monitoring and transmitter identification are indispensable for enforcing spectrum usage policy, efficient spectrum utilization, and network security. This study proposed a robust framework for transmitter identification and protocol categorization via multi-task RF signal classification in shared spectrum environments, where the spectrum monitor will classify transmission protocols (e.g., 4G LTE, 5G-NR, IEEE 802.11a) operating within the same frequency bands, and identify different transmitting base stations, as well as their combinations. A Convolutional Neural Network (CNN) is designed to tackle critical challenges such as overlapping signal characteristics and environmental variability. The proposed method employs a multi-channel input strategy to extract meaningful signal features, achieving remarkable accuracy: 90% for protocol classification, 100% for transmitting base station classification, and 92% for joint classification tasks, utilizing RF data from the POWDER platform. These results highlight the significant potential of the proposed method to enhance spectrum monitoring, management, and security in modern wireless networks.
LGMay 7, 2022
Impact of L1 Batch Normalization on Analog Noise Resistant Property of Deep Learning ModelsOmobayode Fagbohungbe, Lijun Qian
Analog hardware has become a popular choice for machine learning on resource-constrained devices recently due to its fast execution and energy efficiency. However, the inherent presence of noise in analog hardware and the negative impact of the noise on deployed deep neural network (DNN) models limit their usage. The degradation in performance due to the noise calls for the novel design of DNN models that have excellent noiseresistant property, leveraging the properties of the fundamental building block of DNN models. In this work, the use of L1 or TopK BatchNorm type, a fundamental DNN model building block, in designing DNN models with excellent noise-resistant property is proposed. Specifically, a systematic study has been carried out by training DNN models with L1/TopK BatchNorm type, and the performance is compared with DNN models with L2 BatchNorm types. The resulting model noise-resistant property is tested by injecting additive noise to the model weights and evaluating the new model inference accuracy due to the noise. The results show that L1 and TopK BatchNorm type has excellent noise-resistant property, and there is no sacrifice in performance due to the change in the BatchNorm type from L2 to L1/TopK BatchNorm type.
LGDec 19, 2023
Comprehensive Validation on Reweighting Samples for Bias Mitigation via AIF360Christina Hastings Blow, Lijun Qian, Camille Gibson et al.
Fairness AI aims to detect and alleviate bias across the entire AI development life cycle, encompassing data curation, modeling, evaluation, and deployment-a pivotal aspect of ethical AI implementation. Addressing data bias, particularly concerning sensitive attributes like gender and race, reweighting samples proves efficient for fairness AI. This paper contributes a systematic examination of reweighting samples for traditional machine learning (ML) models, employing five models for binary classification on the Adult Income and COMPUS datasets with various protected attributes. The study evaluates prediction results using five fairness metrics, uncovering the nuanced and model-specific nature of reweighting sample effectiveness in achieving fairness in traditional ML models, as well as revealing the complexity of bias dynamics.
LGOct 20, 2024
Data Augmentation via Diffusion Model to Enhance AI FairnessChristina Hastings Blow, Lijun Qian, Camille Gibson et al.
AI fairness seeks to improve the transparency and explainability of AI systems by ensuring that their outcomes genuinely reflect the best interests of users. Data augmentation, which involves generating synthetic data from existing datasets, has gained significant attention as a solution to data scarcity. In particular, diffusion models have become a powerful technique for generating synthetic data, especially in fields like computer vision. This paper explores the potential of diffusion models to generate synthetic tabular data to improve AI fairness. The Tabular Denoising Diffusion Probabilistic Model (Tab-DDPM), a diffusion model adaptable to any tabular dataset and capable of handling various feature types, was utilized with different amounts of generated data for data augmentation. Additionally, reweighting samples from AIF360 was employed to further enhance AI fairness. Five traditional machine learning models-Decision Tree (DT), Gaussian Naive Bayes (GNB), K-Nearest Neighbors (KNN), Logistic Regression (LR), and Random Forest (RF)-were used to validate the proposed approach. Experimental results demonstrate that the synthetic data generated by Tab-DDPM improves fairness in binary classification.
CYApr 25, 2024
Enhancing Deep Knowledge Tracing via Diffusion Models for Personalized Adaptive LearningMing Kuo, Shouvon Sarker, Lijun Qian et al.
In contrast to pedagogies like evidence-based teaching, personalized adaptive learning (PAL) distinguishes itself by closely monitoring the progress of individual students and tailoring the learning path to their unique knowledge and requirements. A crucial technique for effective PAL implementation is knowledge tracing, which models students' evolving knowledge to predict their future performance. Based on these predictions, personalized recommendations for resources and learning paths can be made to meet individual needs. Recent advancements in deep learning have successfully enhanced knowledge tracking through Deep Knowledge Tracing (DKT). This paper introduces generative AI models to further enhance DKT. Generative AI models, rooted in deep learning, are trained to generate synthetic data, addressing data scarcity challenges in various applications across fields such as natural language processing (NLP) and computer vision (CV). This study aims to tackle data shortage issues in student learning records to enhance DKT performance for PAL. Specifically, it employs TabDDPM, a diffusion model, to generate synthetic educational records to augment training data for enhancing DKT. The proposed method's effectiveness is validated through extensive experiments on ASSISTments datasets. The experimental results demonstrate that the AI-generated data by TabDDPM significantly improves DKT performance, particularly in scenarios with small data for training and large data for testing.
LGNov 17, 2025
Multi-task GINN-LP for Multi-target Symbolic RegressionHussein Rajabu, Lijun Qian, Xishuang Dong
In the area of explainable artificial intelligence, Symbolic Regression (SR) has emerged as a promising approach by discovering interpretable mathematical expressions that fit data. However, SR faces two main challenges: most methods are evaluated on scientific datasets with well-understood relationships, limiting generalization, and SR primarily targets single-output regression, whereas many real-world problems involve multi-target outputs with interdependent variables. To address these issues, we propose multi-task regression GINN-LP (MTRGINN-LP), an interpretable neural network for multi-target symbolic regression. By integrating GINN-LP with a multi-task deep learning, the model combines a shared backbone including multiple power-term approximator blocks with task-specific output layers, capturing inter-target dependencies while preserving interpretability. We validate multi-task GINN-LP on practical multi-target applications, including energy efficiency prediction and sustainable agriculture. Experimental results demonstrate competitive predictive performance alongside high interpretability, effectively extending symbolic regression to broader real-world multi-output tasks.
CLSep 23, 2025
Systematic Comparative Analysis of Large Pretrained Language Models on Contextualized Medication Event ExtractionTariq Abdul-Quddoos, Xishuang Dong, Lijun Qian
Attention-based models have become the leading approach in modeling medical language for Natural Language Processing (NLP) in clinical notes. These models outperform traditional techniques by effectively capturing contextual representations of language. In this research a comparative analysis is done amongst pre-trained attention based models namely Bert Base, BioBert, two variations of Bio+Clinical Bert, RoBerta, and Clinical Longformer on task related to Electronic Health Record (EHR) information extraction. The tasks from Track 1 of Harvard Medical School's 2022 National Clinical NLP Challenges (n2c2) are considered for this comparison, with the Contextualized Medication Event Dataset (CMED) given for these task. CMED is a dataset of unstructured EHRs and annotated notes that contain task relevant information about the EHRs. The goal of the challenge is to develop effective solutions for extracting contextual information related to patient medication events from EHRs using data driven methods. Each pre-trained model is fine-tuned and applied on CMED to perform medication extraction, medical event detection, and multi-dimensional medication event context classification. Processing methods are also detailed for breaking down EHRs for compatibility with the applied models. Performance analysis has been carried out using a script based on constructing medical terms from the evaluation portion of CMED with metrics including recall, precision, and F1-Score. The results demonstrate that models pre-trained on clinical data are more effective in detecting medication and medication events, but Bert Base, pre-trained on general domain data showed to be the most effective for classifying the context of events related to medications.
IRJul 5, 2025
Enhancing Learning Path Recommendation via Multi-task LearningAfsana Nasrin, Lijun Qian, Pamela Obiomon et al.
Personalized learning is a student-centered educational approach that adapts content, pace, and assessment to meet each learner's unique needs. As the key technique to implement the personalized learning, learning path recommendation sequentially recommends personalized learning items such as lectures and exercises. Advances in deep learning, particularly deep reinforcement learning, have made modeling such recommendations more practical and effective. This paper proposes a multi-task LSTM model that enhances learning path recommendation by leveraging shared information across tasks. The approach reframes learning path recommendation as a sequence-to-sequence (Seq2Seq) prediction problem, generating personalized learning paths from a learner's historical interactions. The model uses a shared LSTM layer to capture common features for both learning path recommendation and deep knowledge tracing, along with task-specific LSTM layers for each objective. To avoid redundant recommendations, a non-repeat loss penalizes repeated items within the recommended learning path. Experiments on the ASSIST09 dataset show that the proposed model significantly outperforms baseline methods for the learning path recommendation.
CLJun 29, 2025
Ensemble BERT for Medication Event Classification on Electronic Health Records (EHRs)Shouvon Sarker, Xishuang Dong, Lijun Qian
Identification of key variables such as medications, diseases, relations from health records and clinical notes has a wide range of applications in the clinical domain. n2c2 2022 provided shared tasks on challenges in natural language processing for clinical data analytics on electronic health records (EHR), where it built a comprehensive annotated clinical data Contextualized Medication Event Dataset (CMED). This study focuses on subtask 2 in Track 1 of this challenge that is to detect and classify medication events from clinical notes through building a novel BERT-based ensemble model. It started with pretraining BERT models on different types of big data such as Wikipedia and MIMIC. Afterwards, these pretrained BERT models were fine-tuned on CMED training data. These fine-tuned BERT models were employed to accomplish medication event classification on CMED testing data with multiple predictions. These multiple predictions generated by these fine-tuned BERT models were integrated to build final prediction with voting strategies. Experimental results demonstrated that BERT-based ensemble models can effectively improve strict Micro-F score by about 5% and strict Macro-F score by about 6%, respectively.
CYApr 17, 2025
Knowledge Acquisition on Mass-shooting Events via LLMs for AI-Driven JusticeBenign John Ihugba, Afsana Nasrin, Ling Wu et al.
Mass-shooting events pose a significant challenge to public safety, generating large volumes of unstructured textual data that hinder effective investigations and the formulation of public policy. Despite the urgency, few prior studies have effectively automated the extraction of key information from these events to support legal and investigative efforts. This paper presented the first dataset designed for knowledge acquisition on mass-shooting events through the application of named entity recognition (NER) techniques. It focuses on identifying key entities such as offenders, victims, locations, and criminal instruments, that are vital for legal and investigative purposes. The NER process is powered by Large Language Models (LLMs) using few-shot prompting, facilitating the efficient extraction and organization of critical information from diverse sources, including news articles, police reports, and social media. Experimental results on real-world mass-shooting corpora demonstrate that GPT-4o is the most effective model for mass-shooting NER, achieving the highest Micro Precision, Micro Recall, and Micro F1-scores. Meanwhile, o1-mini delivers competitive performance, making it a resource-efficient alternative for less complex NER tasks. It is also observed that increasing the shot count enhances the performance of all models, but the gains are more substantial for GPT-4o and o1-mini, highlighting their superior adaptability to few-shot learning scenarios.
LGJan 4, 2022
Integrating Human-in-the-loop into Swarm Learning for Decentralized Fake News DetectionXishuang Dong, Lijun Qian
Social media has become an effective platform to generate and spread fake news that can mislead people and even distort public opinion. Centralized methods for fake news detection, however, cannot effectively protect user privacy during the process of centralized data collection for training models. Moreover, it cannot fully involve user feedback in the loop of learning detection models for further enhancing fake news detection. To overcome these challenges, this paper proposed a novel decentralized method, Human-in-the-loop Based Swarm Learning (HBSL), to integrate user feedback into the loop of learning and inference for recognizing fake news without violating user privacy in a decentralized manner. It consists of distributed nodes that are able to independently learn and detect fake news on local data. Furthermore, detection models trained on these nodes can be enhanced through decentralized model merging. Experimental results demonstrate that the proposed method outperforms the state-of-the-art decentralized method in regard of detecting fake news on a benchmark dataset.
NIApr 19, 2021
A Joint Energy and Latency Framework for Transfer Learning over 5G Industrial Edge NetworksBo Yang, Omobayode Fagbohungbe, Xuelin Cao et al.
In this paper, we propose a transfer learning (TL)-enabled edge-CNN framework for 5G industrial edge networks with privacy-preserving characteristic. In particular, the edge server can use the existing image dataset to train the CNN in advance, which is further fine-tuned based on the limited datasets uploaded from the devices. With the aid of TL, the devices that are not participating in the training only need to fine-tune the trained edge-CNN model without training from scratch. Due to the energy budget of the devices and the limited communication bandwidth, a joint energy and latency problem is formulated, which is solved by decomposing the original problem into an uploading decision subproblem and a wireless bandwidth allocation subproblem. Experiments using ImageNet demonstrate that the proposed TL-enabled edge-CNN framework can achieve almost 85% prediction accuracy of the baseline by uploading only about 1% model parameters, for a compression ratio of 32 of the autoencoder.
IVFeb 27, 2021
Semi-supervised Learning for COVID-19 Image Classification via ResNetLucy Nwosu, Xiangfang Li, Lijun Qian et al.
Coronavirus disease 2019 (COVID-19) is an ongoing global pandemic in over 200 countries and territories, which has resulted in a great public health concern across the international community. Analysis of X-ray imaging data can play a critical role in timely and accurate screening and fighting against COVID-19. Supervised deep learning has been successfully applied to recognize COVID-19 pathology from X-ray imaging datasets. However, it requires a substantial amount of annotated X-ray images to train models, which is often not applicable to data analysis for emerging events such as COVID-19 outbreak, especially in the early stage of the outbreak. To address this challenge, this paper proposes a two-path semi-supervised deep learning model, ssResNet, based on Residual Neural Network (ResNet) for COVID-19 image classification, where two paths refer to a supervised path and an unsupervised path, respectively. Moreover, we design a weighted supervised loss that assigns higher weight for the minority classes in the training process to resolve the data imbalance. Experimental results on a large-scale of X-ray image dataset COVIDx demonstrate that the proposed model can achieve promising performance even when trained on very few labeled training images.
MLJan 28, 2021
A Survey of Complex-Valued Neural NetworksJoshua Bassey, Lijun Qian, Xianfang Li
Artificial neural networks (ANNs) based machine learning models and especially deep learning models have been widely applied in computer vision, signal processing, wireless communications, and many other domains, where complex numbers occur either naturally or by design. However, most of the current implementations of ANNs and machine learning frameworks are using real numbers rather than complex numbers. There are growing interests in building ANNs using complex numbers, and exploring the potential advantages of the so-called complex-valued neural networks (CVNNs) over their real-valued counterparts. In this paper, we discuss the recent development of CVNNs by performing a survey of the works on CVNNs in the literature. Specifically, a detailed review of various CVNNs in terms of activation function, learning and optimization, input and output representations, and their applications in tasks such as signal processing and computer vision are provided, followed by a discussion on some pertinent challenges and future research directions.
LGNov 24, 2020
Benchmarking Inference Performance of Deep Learning Models on Analog DevicesOmobayode Fagbohungbe, Lijun Qian
Analog hardware implemented deep learning models are promising for computation and energy constrained systems such as edge computing devices. However, the analog nature of the device and the associated many noise sources will cause changes to the value of the weights in the trained deep learning models deployed on such devices. In this study, systematic evaluation of the inference performance of trained popular deep learning models for image classification deployed on analog devices has been carried out, where additive white Gaussian noise has been added to the weights of the trained models during inference. It is observed that deeper models and models with more redundancy in design such as VGG are more robust to the noise in general. However, the performance is also affected by the design philosophy of the model, the detailed structure of the model, the exact machine learning task, as well as the datasets.
IVAug 18, 2020
Offloading Optimization in Edge Computing for Deep Learning Enabled Target Tracking by Internet-of-UAVsBo Yang, Xuelin Cao, Chau Yuen et al.
The empowering unmanned aerial vehicles (UAVs) have been extensively used in providing intelligence such as target tracking. In our field experiments, a pre-trained convolutional neural network (CNN) is deployed at the UAV to identify a target (a vehicle) from the captured video frames and enable the UAV to keep tracking. However, this kind of visual target tracking demands a lot of computational resources due to the desired high inference accuracy and stringent delay requirement. This motivates us to consider offloading this type of deep learning (DL) tasks to a mobile edge computing (MEC) server due to limited computational resource and energy budget of the UAV, and further improve the inference accuracy. Specifically, we propose a novel hierarchical DL tasks distribution framework, where the UAV is embedded with lower layers of the pre-trained CNN model, while the MEC server with rich computing resources will handle the higher layers of the CNN model. An optimization problem is formulated to minimize the weighted-sum cost including the tracking delay and energy consumption introduced by communication and computing of the UAVs, while taking into account the quality of data (e.g., video frames) input to the DL model and the inference errors. Analytical results are obtained and insights are provided to understand the tradeoff between the weighted-sum cost and inference error rate in the proposed framework. Numerical results demonstrate the effectiveness of the proposed offloading framework.
SPJun 29, 2020
Computation Offloading in Multi-Access Edge Computing Networks: A Multi-Task Learning ApproachBo Yang, Xuelin Cao, Joshua Bassey et al.
Multi-access edge computing (MEC) has already shown the potential in enabling mobile devices to bear the computation-intensive applications by offloading some tasks to a nearby access point (AP) integrated with a MEC server (MES). However, due to the varying network conditions and limited computation resources of the MES, the offloading decisions taken by a mobile device and the computational resources allocated by the MES may not be efficiently achieved with the lowest cost. In this paper, we propose a dynamic offloading framework for the MEC network, in which the uplink non-orthogonal multiple access (NOMA) is used to enable multiple devices to upload their tasks via the same frequency band. We formulate the offloading decision problem as a multiclass classification problem and formulate the MES computational resource allocation problem as a regression problem. Then a multi-task learning based feedforward neural network (MTFNN) model is designed to jointly optimize the offloading decision and computational resource allocation. Numerical results illustrate that the proposed MTFNN outperforms the conventional optimization method in terms of inference accuracy and computation complexity.
LGJun 27, 2020
Lessons Learned from Accident of Autonomous Vehicle Testing: An Edge Learning-aided Offloading FrameworkBo Yang, Xuelin Cao, Xiangfang Li et al.
This letter proposes an edge learning-based offloading framework for autonomous driving, where the deep learning tasks can be offloaded to the edge server to improve the inference accuracy while meeting the latency constraint. Since the delay and the inference accuracy are incurred by wireless communications and computing, an optimization problem is formulated to maximize the inference accuracy subject to the offloading probability, the pre-braking probability, and data quality. Simulations demonstrate the superiority of the proposed offloading framework.
LGMay 10, 2020
Efficient Privacy Preserving Edge Computing Framework for Image ClassificationOmobayode Fagbohungbe, Sheikh Rufsan Reza, Xishuang Dong et al.
In order to extract knowledge from the large data collected by edge devices, traditional cloud based approach that requires data upload may not be feasible due to communication bandwidth limitation as well as privacy and security concerns of end users. To address these challenges, a novel privacy preserving edge computing framework is proposed in this paper for image classification. Specifically, autoencoder will be trained unsupervised at each edge device individually, then the obtained latent vectors will be transmitted to the edge server for the training of a classifier. This framework would reduce the communications overhead and protect the data of the end users. Comparing to federated learning, the training of the classifier in the proposed framework does not subject to the constraints of the edge devices, and the autoencoder can be trained independently at each edge device without any server involvement. Furthermore, the privacy of the end users' data is protected by transmitting latent vectors without additional cost of encryption. Experimental results provide insights on the image classification performance vs. various design parameters such as the data compression ratio of the autoencoder and the model complexity.
GNMay 6, 2020
Cell Type Identification from Single-Cell Transcriptomic Data via Semi-supervised LearningXishuang Dong, Shanta Chowdhury, Uboho Victor et al.
Cell type identification from single-cell transcriptomic data is a common goal of single-cell RNA sequencing (scRNAseq) data analysis. Neural networks have been employed to identify cell types from scRNAseq data with high performance. However, it requires a large mount of individual cells with accurate and unbiased annotated types to build the identification models. Unfortunately, labeling the scRNAseq data is cumbersome and time-consuming as it involves manual inspection of marker genes. To overcome this challenge, we propose a semi-supervised learning model to use unlabeled scRNAseq cells and limited amount of labeled scRNAseq cells to implement cell identification. Firstly, we transform the scRNAseq cells to "gene sentences", which is inspired by similarities between natural language system and gene system. Then genes in these sentences are represented as gene embeddings to reduce data sparsity. With these embeddings, we implement a semi-supervised learning model based on recurrent convolutional neural networks (RCNN), which includes a shared network, a supervised network and an unsupervised network. The proposed model is evaluated on macosko2015, a large scale single-cell transcriptomic dataset with ground truth of individual cell types. It is observed that the proposed model is able to achieve encouraging performance by learning on very limited amount of labeled scRNAseq cells together with a large number of unlabeled scRNAseq cells.
LGApr 26, 2020
Ensemble Deep Learning on Time-Series Representation of Tweets for Rumor Detection in Social MediaChandra Mouli Madhav Kotteti, Xishuang Dong, Lijun Qian
Social media is a popular platform for timely information sharing. One of the important challenges for social media platforms like Twitter is whether to trust news shared on them when there is no systematic news verification process. On the other hand, timely detection of rumors is a non-trivial task, given the fast-paced social media environment. In this work, we proposed an ensemble model, which performs majority-voting on a collection of predictions by deep neural networks using time-series vector representation of Twitter data for timely detection of rumors. By combining the proposed data pre-processing method with the ensemble model, better performance of rumor detection has been demonstrated in the experiments using PHEME dataset. Experimental results show that the classification performance has been improved by 7.9% in terms of micro F1 score compared to the baselines.
SPApr 19, 2020
Device Authentication Codes based on RF Fingerprinting using Deep LearningJoshua Bassey, Xiangfang Li, Lijun Qian
In this paper, we propose Device Authentication Code (DAC), a novel method for authenticating IoT devices with wireless interface by exploiting their radio frequency (RF) signatures. The proposed DAC is based on RF fingerprinting, information theoretic method, feature learning, and discriminatory power of deep learning. Specifically, an autoencoder is used to automatically extract features from the RF traces, and the reconstruction error is used as the DAC and this DAC is unique to the device and the particular message of interest. Then Kolmogorov-Smirnov (K-S) test is used to match the distribution of the reconstruction error generated by the autoencoder and the received message, and the result will determine whether the device of interest belongs to an authorized user. We validate this concept on two experimentally collected RF traces from six ZigBee and five universal software defined radio peripheral (USRP) devices, respectively. The traces span a range of Signalto- Noise Ratio by varying locations and mobility of the devices and channel interference and noise to ensure robustness of the model. Experimental results demonstrate that DAC is able to prevent device impersonation by extracting salient features that are unique to any wireless device of interest and can be used to identify RF devices. Furthermore, the proposed method does not need the RF traces of the intruder during model training yet be able to identify devices not seen during training, which makes it practical.
CLJan 31, 2020
Two-path Deep Semi-supervised Learning for Timely Fake News DetectionXishuang Dong, Uboho Victor, Lijun Qian
News in social media such as Twitter has been generated in high volume and speed. However, very few of them are labeled (as fake or true news) by professionals in near real time. In order to achieve timely detection of fake news in social media, a novel framework of two-path deep semi-supervised learning is proposed where one path is for supervised learning and the other is for unsupervised learning. The supervised learning path learns on the limited amount of labeled data while the unsupervised learning path is able to learn on a huge amount of unlabeled data. Furthermore, these two paths implemented with convolutional neural networks (CNN) are jointly optimized to complete semi-supervised learning. In addition, we build a shared CNN to extract the low level features on both labeled data and unlabeled data to feed them into these two paths. To verify this framework, we implement a Word CNN based semi-supervised learning model and test it on two datasets, namely, LIAR and PHEME. Experimental results demonstrate that the model built on the proposed framework can recognize fake news effectively with very few labeled data.
CLJun 10, 2019
Deep Two-path Semi-supervised Learning for Fake News DetectionXishuang Dong, Uboho Victor, Shanta Chowdhury et al.
News in social media such as Twitter has been generated in high volume and speed. However, very few of them can be labeled (as fake or true news) in a short time. In order to achieve timely detection of fake news in social media, a novel deep two-path semi-supervised learning model is proposed, where one path is for supervised learning and the other is for unsupervised learning. These two paths implemented with convolutional neural networks are jointly optimized to enhance detection performance. In addition, we build a shared convolutional neural networks between these two paths to share the low level features. Experimental results using Twitter datasets show that the proposed model can recognize fake news effectively with very few labeled data.
CVMar 30, 2018
Hierarchical Transfer Convolutional Neural Networks for Image ClassificationXishuang Dong, Hsiang-Huang Wu, Yuzhong Yan et al.
In this paper, we address the issue of how to enhance the generalization performance of convolutional neural networks (CNN) in the early learning stage for image classification. This is motivated by real-time applications that require the generalization performance of CNN to be satisfactory within limited training time. In order to achieve this, a novel hierarchical transfer CNN framework is proposed. It consists of a group of shallow CNNs and a cloud CNN, where the shallow CNNs are trained firstly and then the first layers of the trained shallow CNNs are used to initialize the first layer of the cloud CNN. This method will boost the generalization performance of the cloud CNN significantly, especially during the early stage of training. Experiments using CIFAR-10 and ImageNet datasets are performed to examine the proposed method. Results demonstrate the improvement of testing accuracy is 12% on average and as much as 20% for the CIFAR-10 case while 5% testing accuracy improvement for the ImageNet case during the early stage of learning. It is also shown that universal improvements of testing accuracy are obtained across different settings of dropout and number of shallow CNNs.