Deepu John

LG
h-index23
23papers
432citations
Novelty47%
AI Score56

23 Papers

CVAug 20, 2024Code
HiRED: Attention-Guided Token Dropping for Efficient Inference of High-Resolution Vision-Language Models

Kazi Hasan Ibn Arif, JinYi Yoon, Dimitrios S. Nikolopoulos et al.

High-resolution Vision-Language Models (VLMs) are widely used in multimodal tasks to enhance accuracy by preserving detailed image information. However, these models often generate an excessive number of visual tokens due to the need to encode multiple partitions of a high-resolution image input. Processing such a large number of visual tokens through multiple transformer networks poses significant computational challenges, particularly for resource-constrained commodity GPUs. To address this challenge, we propose High-Resolution Early Dropping (HiRED), a plug-and-play token-dropping method designed to operate within a fixed token budget. HiRED leverages the attention of CLS token in the vision transformer (ViT) to assess the visual content of the image partitions and allocate an optimal token budget for each partition accordingly. The most informative visual tokens from each partition within the allocated budget are then selected and passed to the subsequent Large Language Model (LLM). We showed that HiRED achieves superior accuracy and performance, compared to existing token-dropping methods. Empirically, HiRED-20% (i.e., a 20% token budget) on LLaVA-Next-7B achieves a 4.7x increase in token generation throughput, reduces response latency by 78%, and saves 14% of GPU memory for single inference on an NVIDIA TESLA P40 (24 GB). For larger batch sizes (e.g., 4), HiRED-20% prevents out-of-memory errors by cutting memory usage by 30%, while preserving throughput and latency benefits. Code - https://github.com/hasanar1f/HiRED

LGMar 1, 2022
A predictive analytics approach for stroke prediction using machine learning and neural networks

Soumyabrata Dev, Hewei Wang, Chidozie Shamrock Nwosu et al.

The negative impact of stroke in society has led to concerted efforts to improve the management and diagnosis of stroke. With an increased synergy between technology and medical diagnosis, caregivers create opportunities for better patient management by systematically mining and archiving the patients' medical records. Therefore, it is vital to study the interdependency of these risk factors in patients' health records and understand their relative contribution to stroke prediction. This paper systematically analyzes the various factors in electronic health records for effective stroke prediction. Using various statistical techniques and principal component analysis, we identify the most important factors for stroke prediction. We conclude that age, heart disease, average glucose level, and hypertension are the most important factors for detecting stroke in patients. Furthermore, a perceptron neural network using these four attributes provides the highest accuracy rate and lowest miss rate compared to using all available input features and other benchmarking algorithms. As the dataset is highly imbalanced concerning the occurrence of stroke, we report our results on a balanced dataset created via sub-sampling techniques.

LGSep 15, 2023Code
POCKET: Pruning Random Convolution Kernels for Time Series Classification from a Feature Selection Perspective

Shaowu Chen, Weize Sun, Lei Huang et al.

In recent years, two competitive time series classification models, namely, ROCKET and MINIROCKET, have garnered considerable attention due to their low training cost and high accuracy. However, they rely on a large number of random 1-D convolutional kernels to comprehensively capture features, which is incompatible with resource-constrained devices. Despite the development of heuristic algorithms designed to recognize and prune redundant kernels, the inherent time-consuming nature of evolutionary algorithms hinders efficient evaluation. To efficiently prune models, this paper eliminates feature groups contributing minimally to the classifier, thereby discarding the associated random kernels without direct evaluation. To this end, we incorporate both group-level ($l_{2,1}$-norm) and element-level ($l_2$-norm) regularizations to the classifier, formulating the pruning challenge as a group elastic net classification problem. An ADMM-based algorithm is initially introduced to solve the problem, but it is computationally intensive. Building on the ADMM-based algorithm, we then propose our core algorithm, POCKET, which significantly speeds up the process by dividing the task into two sequential stages. In Stage 1, POCKET utilizes dynamically varying penalties to efficiently achieve group sparsity within the classifier, removing features associated with zero weights and their corresponding kernels. In Stage 2, the remaining kernels and features are used to refit a $l_2$-regularized classifier for enhanced performance. Experimental results on diverse time series datasets show that POCKET prunes up to 60% of kernels without a significant reduction in accuracy and performs 11$\times$ faster than its counterparts. Our code is publicly available at https://github.com/ShaowuChen/POCKET.

1.5LGJun 1
EEG-FuseFormer: A Transformer-Driven Feature Fusion Framework for Seizure Onset Prediction

Vigneshwar Hariharan, Chithra Reghuvaran, Arlene John et al.

Epilepsy is one of the most common neurological disorders globally, characterized by recurring seizures and significantly impacting the quality of life. Despite advancements in diagnostic techniques, the mitigation of risks faced by epilepsy patients remains challenging due to the unpredictability of seizure events. An accurate forecast of seizure onset helps to reduce risks in epilepsy patients. In this paper, we propose EEG-FuseFormer, a transformer-based feature fusion framework for seizure-onset prediction that combines intermediate features extracted from Convolutional Neural Networks-Long Short-Term Memory (CNN-LSTM) and ResNet-18 networks. The CNN-LSTM architecture captures both spatial and temporal features directly from the raw signal, whereas the ResNet-18 extracts features from the Short-Time Fourier Transform (STFT) representation of the EEG signals. Fusion is carried out using a transformer encoder, and the final prediction is generated using fully connected dense layers. The CHB-MIT dataset was used to validate the proposed model. The results show that the proposed model achieves a mean recall of 98.85% and outperforms most of the state-of-the-art methods. This study evaluates the ability of the proposed feature fusion model to generalize in cross-patient testing scenarios. Fine-tuning pre-trained models on limited target patient data (target adaptation) within the cross-patient validation framework results in higher recall, precision, and F1-score metrics in comparison to the conventional cross-patient validation approach. Finally, the runtime-based computational complexity of the model is assessed across diverse hardware platforms to highlight the performance-complexity trade-off.

CVOct 17, 2023
Unsupervised Pre-Training Using Masked Autoencoders for ECG Analysis

Guoxin Wang, Qingyuan Wang, Ganesh Neelakanta Iyer et al.

Unsupervised learning methods have become increasingly important in deep learning due to their demonstrated large utilization of datasets and higher accuracy in computer vision and natural language processing tasks. There is a growing trend to extend unsupervised learning methods to other domains, which helps to utilize a large amount of unlabelled data. This paper proposes an unsupervised pre-training technique based on masked autoencoder (MAE) for electrocardiogram (ECG) signals. In addition, we propose a task-specific fine-tuning to form a complete framework for ECG analysis. The framework is high-level, universal, and not individually adapted to specific model architectures or tasks. Experiments are conducted using various model architectures and large-scale datasets, resulting in an accuracy of 94.39% on the MITDB dataset for ECG arrhythmia classification task. The result shows a better performance for the classification of previously unseen data for the proposed approach compared to fully supervised methods.

SPJun 14, 2022
Classification of ECG based on Hybrid Features using CNNs for Wearable Applications

Li Xiaolin, Fang Xiang, Rajesh C. Panicker et al.

Sudden cardiac death and arrhythmia account for a large percentage of all deaths worldwide. Electrocardiography (ECG) is the most widely used screening tool for cardiovascular diseases. Traditionally, ECG signals are classified manually, requiring experience and great skill, while being time-consuming and prone to error. Thus machine learning algorithms have been widely adopted because of their ability to perform complex data analysis. Features derived from the points of interest in ECG - mainly Q, R, and S, are widely used for arrhythmia detection. In this work, we demonstrate improved performance for ECG classification using hybrid features and three different models, building on a 1-D convolutional neural network (CNN) model that we had proposed in the past. An RR interval features based model proposed in this work achieved an accuracy of 98.98%, which is an improvement over the baseline model. To make the model immune to noise, we updated the model using frequency features and achieved good sustained performance in presence of noise with a slightly lower accuracy of 98.69%. Further, another model combining the frequency features and the RR interval features was developed, which achieved a high accuracy of 99% with good sustained performance in noisy environments. Due to its high accuracy and noise immunity, the proposed model which combines multiple hybrid features, is well suited for ambulatory wearable sensing applications.

SPJun 14, 2022
Atrial Fibrillation Detection Using Weight-Pruned, Log-Quantised Convolutional Neural Networks

Xiu Qi Chang, Ann Feng Chew, Benjamin Chen Ming Choong et al.

Deep neural networks (DNN) are a promising tool in medical applications. However, the implementation of complex DNNs on battery-powered devices is challenging due to high energy costs for communication. In this work, a convolutional neural network model is developed for detecting atrial fibrillation from electrocardiogram (ECG) signals. The model demonstrates high performance despite being trained on limited, variable-length input data. Weight pruning and logarithmic quantisation are combined to introduce sparsity and reduce model size, which can be exploited for reduced data movement and lower computational complexity. The final model achieved a 91.1% model compression ratio while maintaining high model accuracy of 91.7% and less than 1% loss.

CVJul 20, 2025Code
Polymorph: Energy-Efficient Multi-Label Classification for Video Streams on Embedded Devices

Saeid Ghafouri, Mohsen Fayyaz, Xiangchen Li et al.

Real-time multi-label video classification on embedded devices is constrained by limited compute and energy budgets. Yet, video streams exhibit structural properties such as label sparsity, temporal continuity, and label co-occurrence that can be leveraged for more efficient inference. We introduce Polymorph, a context-aware framework that activates a minimal set of lightweight Low Rank Adapters (LoRA) per frame. Each adapter specializes in a subset of classes derived from co-occurrence patterns and is implemented as a LoRA weight over a shared backbone. At runtime, Polymorph dynamically selects and composes only the adapters needed to cover the active labels, avoiding full-model switching and weight merging. This modular strategy improves scalability while reducing latency and energy overhead. Polymorph achieves 40% lower energy consumption and improves mAP by 9 points over strong baselines on the TAO dataset. Polymorph is open source at https://github.com/inference-serving/polymorph/.

CVAug 7, 2025Code
Optimal Brain Connection: Towards Efficient Structural Pruning

Shaowu Chen, Wei Ma, Binhua Huang et al.

Structural pruning has been widely studied for its effectiveness in compressing neural networks. However, existing methods often neglect the interconnections among parameters. To address this limitation, this paper proposes a structural pruning framework termed Optimal Brain Connection. First, we introduce the Jacobian Criterion, a first-order metric for evaluating the saliency of structural parameters. Unlike existing first-order methods that assess parameters in isolation, our criterion explicitly captures both intra-component interactions and inter-layer dependencies. Second, we propose the Equivalent Pruning mechanism, which utilizes autoencoders to retain the contributions of all original connection--including pruned ones--during fine-tuning. Experimental results demonstrate that the Jacobian Criterion outperforms several popular metrics in preserving model performance, while the Equivalent Pruning mechanism effectively mitigates performance degradation after fine-tuning. Code: https://github.com/ShaowuChen/Optimal_Brain_Connection

ARJul 22, 2024
KWT-Tiny: RISC-V Accelerated, Embedded Keyword Spotting Transformer

Aness Al-Qawlaq, Ajay Kumar M, Deepu John

This paper explores the adaptation of Transformerbased models for edge devices through the quantisation and hardware acceleration of the ARM Keyword Transformer (KWT) model on a RISC-V platform. The model was targeted to run on 64kB RAM in bare-metal C using a custom-developed edge AI library. KWT-1 was retrained to be 369 times smaller, with only a 10% loss in accuracy through reducing output classes from 35 to 2. The retraining and quantisation reduced model size from 2.42 MB to 1.65 kB. The integration of custom RISC-V instructions that accelerated GELU and SoftMax operations enabled a 5x speedup and thus ~5x power reduction in inference, with inference clock cycle counts decreasing from 26 million to 5.5 million clock cycles while incurring a small area overhead of approximately 29%. The results demonstrate a viable method for porting and accelerating Transformer-based models in low-power IoT devices.

DCJan 15
WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

Xiangchen Li, Jiakun Fan, Qingyuan Wang et al.

As Large Language Models (LLMs) become increasingly accessible to end users, an ever-growing number of inference requests are initiated from edge devices and computed on centralized GPU clusters. However, the resulting exponential growth in computation workload is placing significant strain on data centers, while edge devices remain largely underutilized, leading to imbalanced workloads and resource inefficiency across the network. Integrating edge devices into the LLM inference process via speculative decoding helps balance the workload between the edge and the cloud, while maintaining lossless prediction accuracy. In this paper, we identify and formalize two critical bottlenecks that limit the efficiency and scalability of distributed speculative LLM serving: Wasted Drafting Time and Verification Interference. To address these challenges, we propose WISP, an efficient and SLO-aware distributed LLM inference system that consists of an intelligent speculation controller, a verification time estimator, and a verification batch scheduler. These components collaboratively enhance drafting efficiency and optimize verification request scheduling on the server. Extensive numerical results show that WISP improves system capacity by up to 2.1x and 4.1x, and increases system goodput by up to 1.94x and 3.7x, compared to centralized serving and SLED, respectively.

AIMar 26, 2024
Tiny Models are the Computational Saver for Large Models

Qingyuan Wang, Barry Cardiff, Antoine Frappé et al.

This paper introduces TinySaver, an early-exit-like dynamic model compression approach which employs tiny models to substitute large models adaptively. Distinct from traditional compression techniques, dynamic methods like TinySaver can leverage the difficulty differences to allow certain inputs to complete their inference processes early, thereby conserving computational resources. Most existing early exit designs are implemented by attaching additional network branches to the model's backbone. Our study, however, reveals that completely independent tiny models can replace a substantial portion of the larger models' job with minimal impact on performance. Employing them as the first exit can remarkably enhance computational efficiency. By searching and employing the most appropriate tiny model as the computational saver for a given large model, the proposed approaches work as a novel and generic method to model compression. This finding will help the research community in exploring new compression methods to address the escalating computational demands posed by rapidly evolving AI models. Our evaluation of this approach in ImageNet-1k classification demonstrates its potential to reduce the number of compute operations by up to 90\%, with only negligible losses in performance, across various modern vision models.

DCJun 11, 2025
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving

Xiangchen Li, Dimitrios Spatharakis, Saeid Ghafouri et al.

The growing gap between the increasing complexity of large language models (LLMs) and the limited computational budgets of edge devices poses a key challenge for efficient on-device inference, despite gradual improvements in hardware capabilities. Existing strategies, such as aggressive quantization, pruning, or remote inference, trade accuracy for efficiency or lead to substantial cost burdens. This position paper introduces a new framework that leverages speculative decoding, previously viewed primarily as a decoding acceleration technique for autoregressive generation of LLMs, as a promising approach specifically adapted for edge computing by orchestrating computation across heterogeneous devices. We propose \acronym, a framework that allows lightweight edge devices to draft multiple candidate tokens locally using diverse draft models, while a single, shared edge server verifies the tokens utilizing a more precise target model. To further increase the efficiency of verification, the edge server batch the diverse verification requests from devices. This approach supports device heterogeneity and reduces server-side memory footprint by sharing the same upstream target model across multiple devices. Our initial experiments with Jetson Orin Nano, Raspberry Pi 4B/5, and an edge server equipped with 4 Nvidia A100 GPUs indicate substantial benefits: 2.2 more system throughput, 2.8 more system capacity, and better cost efficiency, all without sacrificing model accuracy.

LGMar 4, 2024
DyCE: Dynamically Configurable Exiting for Deep Learning Compression and Real-time Scaling

Qingyuan Wang, Barry Cardiff, Antoine Frappé et al.

Conventional deep learning (DL) model compression and scaling methods focus on altering the model's components, impacting the results across all samples uniformly. However, since samples vary in difficulty, a dynamic model that adapts computation based on sample complexity offers a novel perspective for compression and scaling. Despite this potential, existing dynamic models are typically monolithic and model-specific, limiting their generalizability as broad compression and scaling methods. Additionally, most deployed DL systems are fixed, unable to adjust their scale once deployed and, therefore, cannot adapt to the varying real-time demands. This paper introduces DyCE, a dynamically configurable system that can adjust the performance-complexity trade-off of a DL model at runtime without requiring re-initialization or redeployment on inference hardware. DyCE achieves this by adding small exit networks to intermediate layers of the original model, allowing computation to terminate early if acceptable results are obtained. DyCE also decouples the design of an efficient dynamic model, facilitating easy adaptation to new base models and potential general use in compression and scaling. We also propose methods for generating optimized configurations and determining the types and positions of exit networks to achieve desired performance and complexity trade-offs. By enabling simple configuration switching, DyCE provides fine-grained performance tuning in real-time. We demonstrate the effectiveness of DyCE through image classification tasks using deep convolutional neural networks (CNNs). DyCE significantly reduces computational complexity by 23.5% for ResNet152 and 25.9% for ConvNextv2-tiny on ImageNet, with accuracy reductions of less than 0.5%.

CVSep 3, 2025
TinyDrop: Tiny Model Guided Token Dropping for Vision Transformers

Guoxin Wang, Qingyuan Wang, Binhua Huang et al.

Vision Transformers (ViTs) achieve strong performance in image classification but incur high computational costs from processing all image tokens. To reduce inference costs in large ViTs without compromising accuracy, we propose TinyDrop, a training-free token dropping framework guided by a lightweight vision model. The guidance model estimates the importance of tokens while performing inference, thereby selectively discarding low-importance tokens if large vit models need to perform attention calculations. The framework operates plug-and-play, requires no architectural modifications, and is compatible with diverse ViT architectures. Evaluations on standard image classification benchmarks demonstrate that our framework reduces FLOPs by up to 80% for ViTs with minimal accuracy degradation, highlighting its generalization capability and practical utility for efficient ViT-based classification.

DCJun 30, 2025
QPART: Adaptive Model Quantization and Dynamic Workload Balancing for Accuracy-aware Edge Inference

Xiangchen Li, Saeid Ghafouri, Bo Ji et al.

As machine learning inferences increasingly move to edge devices, adapting to diverse computational capabilities, hardware, and memory constraints becomes more critical. Instead of relying on a pre-trained model fixed for all future inference queries across diverse edge devices, we argue that planning an inference pattern with a request-specific model tailored to the device's computational capacity, accuracy requirements, and time constraints is more cost-efficient and robust to diverse scenarios. To this end, we propose an accuracy-aware and workload-balanced inference system that integrates joint model quantization and inference partitioning. In this approach, the server dynamically responds to inference queries by sending a quantized model and adaptively sharing the inference workload with the device. Meanwhile, the device's computational power, channel capacity, and accuracy requirements are considered when deciding. Furthermore, we introduce a new optimization framework for the inference system, incorporating joint model quantization and partitioning. Our approach optimizes layer-wise quantization bit width and partition points to minimize time consumption and cost while accounting for varying accuracy requirements of tasks through an accuracy degradation metric in our optimization model. To our knowledge, this work represents the first exploration of optimizing quantization layer-wise bit-width in the inference serving system, by introducing theoretical measurement of accuracy degradation. Simulation results demonstrate a substantial reduction in overall time and power consumption, with computation payloads decreasing by over 80% and accuracy degradation kept below 1%.

CVMay 7, 2025
ORXE: Orchestrating Experts for Dynamically Configurable Efficiency

Qingyuan Wang, Guoxin Wang, Barry Cardiff et al.

This paper presents ORXE, a modular and adaptable framework for achieving real-time configurable efficiency in AI models. By leveraging a collection of pre-trained experts with diverse computational costs and performance levels, ORXE dynamically adjusts inference pathways based on the complexity of input samples. Unlike conventional approaches that require complex metamodel training, ORXE achieves high efficiency and flexibility without complicating the development process. The proposed system utilizes a confidence-based gating mechanism to allocate appropriate computational resources for each input. ORXE also supports adjustments to the preference between inference cost and prediction performance across a wide range during runtime. We implemented a training-free ORXE system for image classification tasks, evaluating its efficiency and accuracy across various devices. The results demonstrate that ORXE achieves superior performance compared to individual experts and other dynamic models in most cases. This approach can be extended to other applications, providing a scalable solution for diverse real-world deployment scenarios.

SPJan 31, 2025
DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits

Xiaolin Li, Binhua Huang, Barry Cardiff et al.

DCentNet is a novel decentralized multistage signal classification approach designed for biomedical data from IoT wearable sensors, integrating early exit points (EEP) to enhance energy efficiency and processing speed. Unlike traditional centralized processing methods, which result in high energy consumption and latency, DCentNet partitions a single CNN model into multiple sub-networks using EEPs. By introducing encoder-decoder pairs at EEPs, the system compresses large feature maps before transmission, significantly reducing wireless data transfer and power usage. If an input is confidently classified at an EEP, processing stops early, optimizing efficiency. Initial sub-networks can be deployed on fog or edge devices to further minimize energy consumption. A genetic algorithm is used to optimize EEP placement, balancing performance and complexity. Experimental results on ECG classification show that with one EEP, DCentNet reduces wireless data transmission by 94.54% and complexity by 21%, while maintaining original accuracy and sensitivity. With two EEPs, sensitivity reaches 98.36%, accuracy 97.74%, wireless data transmission decreases by 91.86%, and complexity is reduced by 22%. Implemented on an ARM Cortex-M4 MCU, DCentNet achieves an average power saving of 73.6% compared to continuous wireless ECG transmission.

LGOct 19, 2021
Identifying Stroke Indicators Using Rough Sets

Muhammad Salman Pathan, Jianbiao Zhang, Deepu John et al.

Stroke is widely considered as the second most common cause of mortality. The adverse consequences of stroke have led to global interest and work for improving the management and diagnosis of stroke. Various techniques for data mining have been used globally for accurate prediction of occurrence of stroke based on the risk factors that are associated with the electronic health care records (EHRs) of the patients. In particular, EHRs routinely contain several thousands of features and most of them are redundant and irrelevant that need to be discarded to enhance the prediction accuracy. The choice of feature-selection methods can help in improving the prediction accuracy of the model and efficient data management of the archived input features. In this paper, we systematically analyze the various features in EHR records for the detection of stroke. We propose a novel rough-set based technique for ranking the importance of the various EHR records in detecting stroke. Unlike the conventional rough-set techniques, our proposed technique can be applied on any dataset that comprises binary feature sets. We evaluated our proposed method in a publicly available dataset of EHR, and concluded that age, average glucose level, heart disease, and hypertension were the most essential attributes for detecting stroke in patients. Furthermore, we benchmarked the proposed technique with other popular feature-selection techniques. We obtained the best performance in ranking the importance of individual features in detecting stroke.

LGAug 31, 2021
Multistage Pruning of CNN Based ECG Classifiers for Edge Devices

Xiaolin Li, Rajesh Panicker, Barry Cardiff et al.

Using smart wearable devices to monitor patients electrocardiogram (ECG) for real-time detection of arrhythmias can significantly improve healthcare outcomes. Convolutional neural network (CNN) based deep learning has been used successfully to detect anomalous beats in ECG. However, the computational complexity of existing CNN models prohibits them from being implemented in low-powered edge devices. Usually, such models are complex with lots of model parameters which results in large number of computations, memory, and power usage in edge devices. Network pruning techniques can reduce model complexity at the expense of performance in CNN models. This paper presents a novel multistage pruning technique that reduces CNN model complexity with negligible loss in performance compared to existing pruning techniques. An existing CNN model for ECG classification is used as a baseline reference. At 60% sparsity, the proposed technique achieves 97.7% accuracy and an F1 score of 93.59% for ECG classification tasks. This is an improvement of 3.3% and 9% for accuracy and F1 Score respectively, compared to traditional pruning with fine-tuning approach. Compared to the baseline model, we also achieve a 60.4% decrease in run-time complexity.

LGAug 25, 2021
SomnNET: An SpO2 Based Deep Learning Network for Sleep Apnea Detection in Smartwatches

Arlene John, Koushik Kumar Nundy, Barry Cardiff et al.

The abnormal pause or rate reduction in breathing is known as the sleep-apnea hypopnea syndrome and affects the quality of sleep of an individual. A novel method for the detection of sleep apnea events (pause in breathing) from peripheral oxygen saturation (SpO2) signals obtained from wearable devices is discussed in this paper. The paper details an apnea detection algorithm of a very high resolution on a per-second basis for which a 1-dimensional convolutional neural network -- which we termed SomnNET -- is developed. This network exhibits an accuracy of 97.08% and outperforms several lower resolution state-of-the-art apnea detection methods. The feasibility of model pruning and binarization to reduce the computational complexity is explored. The pruned network with 80% sparsity exhibited an accuracy of 89.75%, and the binarized network exhibited an accuracy of 68.22%. The performance of the proposed networks is compared against several state-of-the-art algorithms.

LGMay 2, 2021
A 1D-CNN Based Deep Learning Technique for Sleep Apnea Detection in IoT Sensors

Arlene John, Barry Cardiff, Deepu John

Internet of Things (IoT) enabled wearable sensors for health monitoring are widely used to reduce the cost of personal healthcare and improve quality of life. The sleep apnea-hypopnea syndrome, characterized by the abnormal reduction or pause in breathing, greatly affects the quality of sleep of an individual. This paper introduces a novel method for apnea detection (pause in breathing) from electrocardiogram (ECG) signals obtained from wearable devices. The novelty stems from the high resolution of apnea detection on a second-by-second basis, and this is achieved using a 1-dimensional convolutional neural network for feature extraction and detection of sleep apnea events. The proposed method exhibits an accuracy of 99.56% and a sensitivity of 96.05%. This model outperforms several lower resolution state-of-the-art apnea detection methods. The complexity of the proposed model is analyzed. We also analyze the feasibility of model pruning and binarization to reduce the resource requirements on a wearable IoT device. The pruned model with 80\% sparsity exhibited an accuracy of 97.34% and a sensitivity of 86.48%. The binarized model exhibited an accuracy of 75.59% and sensitivity of 63.23%. The performance of low complexity patient-specific models derived from the generic model is also studied to analyze the feasibility of retraining existing models to fit patient-specific requirements. The patient-specific models on average exhibited an accuracy of 97.79% and sensitivity of 92.23%. The source code for this work is made publicly available.

CRMay 2, 2021
Continuous User Authentication using IoT Wearable Sensors

Conor Smyth, Guoxin Wang, Rajesh Panicker et al.

Over the past several years, the electrocardiogram (ECG) has been investigated for its uniqueness and potential to discriminate between individuals. This paper discusses how this discriminatory information can help in continuous user authentication by a wearable chest strap which uses dry electrodes to obtain a single lead ECG signal. To the best of the authors' knowledge, this is the first such work which deals with continuous authentication using a genuine wearable device as most prior works have either used medical equipment employing gel electrodes to obtain an ECG signal or have obtained an ECG signal through electrode positions that would not be feasible using a wearable device. Prior works have also mainly dealt with using the ECG signal for identification rather than verification, or dealt with using the ECG signal for discrete authentication. This paper presents a novel algorithm which uses QRS detection, weighted averaging, Discrete Cosine Transform (DCT), and a Support Vector Machine (SVM) classifier to determine whether the wearer of the device should be positively verified or not. Zero intrusion attempts were successful when tested on a database consisting of 33 subjects.