Gari D. Clifford

h-index: 72

28 papers

522 citations

Novelty: 35%

AI Score: 52

Ranked #15,728 of 194,257 authors (top 8%); #3,967 in LG (top 10%)

28 Papers

5.9 · QM · Sep 25, 2024 · Code

ECG-Image-Database: A Dataset of ECG Images with Real-World Imaging and Scanning Artifacts; A Foundation for Computerized ECG Image Digitization and Analysis

Matthew A. Reyna, Deepanshi, James Weigle et al.

We introduce the ECG-Image-Database, a large and diverse collection of electrocardiogram (ECG) images generated from ECG time-series data, with real-world scanning, imaging, and physical artifacts. We used ECG-Image-Kit, an open-source Python toolkit, to generate realistic images of 12-lead ECG printouts from raw ECG time-series. The images include realistic distortions such as noise, wrinkles, stains, and perspective shifts, generated both digitally and physically. The toolkit was applied to 977 12-lead ECG records from the PTB-XL database and 1,000 from Emory Healthcare to create high-fidelity synthetic ECG images. These unique images were subjected to both programmatic distortions using ECG-Image-Kit and physical effects like soaking, staining, and mold growth, followed by scanning and photography under various lighting conditions to create real-world artifacts. The resulting dataset includes 35,595 software-labeled ECG images with a wide range of imaging artifacts and distortions. The dataset provides ground truth time-series data alongside the images, offering a reference for developing machine and deep learning models for ECG digitization and classification. The images vary in quality, from clear scans of clean papers to noisy photographs of degraded papers, enabling the development of more generalizable digitization algorithms. ECG-Image-Database addresses a critical need for digitizing paper-based and non-digital ECGs for computerized analysis, providing a foundation for developing robust machine and deep learning models capable of converting ECG images into time-series. The dataset aims to serve as a reference for ECG digitization and computerized annotation efforts. ECG-Image-Database was used in the PhysioNet Challenge 2024 on ECG image digitization and classification.

2.3 · SP · Jan 6, 2023

A Data-Driven Gaussian Process Filter for Electrocardiogram Denoising

Mircea Dumitru, Qiao Li, Erick Andres Perez Alday et al.

Objective: Gaussian process (GP)-based filters, which have been used effectively in various applications including electrocardiogram (ECG) filtering, can be computationally demanding, and the choice of their hyperparameters is typically ad hoc. Methods: We develop a data-driven GP filter to address both issues, using the notion of the ECG phase domain -- a time-warped representation of the ECG beats onto a fixed number of samples and aligned R-peaks, which is assumed to follow a Gaussian distribution. Under this assumption, the computation of the sample mean and covariance matrix is simplified, enabling an efficient implementation of the GP filter in a data-driven manner, with no ad hoc hyperparameters. The proposed filter is evaluated and compared with a state-of-the-art wavelet-based filter on the PhysioNet QT Database. The performance is evaluated by measuring the signal-to-noise ratio (SNR) improvement of the filter at SNR levels ranging from -5 to 30 dB, in 5 dB steps, using additive noise. For a clinical evaluation, the error between the estimated QT-intervals of the original and filtered signals is measured and compared with the benchmark filter. Results: It is shown that the proposed GP filter outperforms the benchmark filter for all the tested noise levels. It also outperforms the state-of-the-art filter in terms of QT-interval estimation error bias and variance. Conclusion: The proposed GP filter is a versatile technique for preprocessing the ECG in clinical and research applications, is applicable to ECG of arbitrary lengths and sampling frequencies, and provides confidence intervals for its performance.
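The SNR-improvement protocol described in this abstract (additive noise at input SNRs from -5 to 30 dB in 5 dB steps) can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the toy sine signal, the white Gaussian noise model, and the moving-average denoiser, which stands in for a filter and is not the GP filter from the paper.

```python
import math
import random

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

def snr_db(clean, noisy):
    """SNR in dB of `noisy`, measured against the known clean signal."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(power(clean) / power(noise))

def add_noise_at_snr(clean, target_snr_db, rng):
    """Add white Gaussian noise scaled so the result has the target SNR."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    scale = math.sqrt(power(clean) / (power(noise) * 10 ** (target_snr_db / 10.0)))
    return [c + scale * n for c, n in zip(clean, noise)]

def moving_average(x, width=5):
    """Stand-in denoiser (NOT the GP filter): centered moving average."""
    half = width // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

rng = random.Random(0)
# Toy "clean" signal: a 1 Hz sine sampled at 250 Hz (hypothetical values).
clean = [math.sin(2 * math.pi * t / 250.0) for t in range(1000)]

# SNR improvement = output SNR minus input SNR, per tested noise level.
results = {}
for snr_in in range(-5, 35, 5):
    noisy = add_noise_at_snr(clean, snr_in, rng)
    results[snr_in] = snr_db(clean, moving_average(noisy)) - snr_in
```

A real evaluation would replace the sine with ECG records and the moving average with the filters under comparison; only the bookkeeping of input and output SNR carries over.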

6.6 · AI · May 17

Prediction of Challenging Behaviors Associated with Profound Autism in a Classroom Setting Using Wearable Sensors

Yadhu Kartha, Conor Anderson, Jenny Foster et al.

Autism Spectrum Disorder (ASD) is characterized by challenges with social interaction and communication and by restricted or repetitive patterns of thought and behavior, with significant variability in presentation. Approximately a quarter of children with ASD are classified as having profound autism and often exhibit challenging behaviors, such as self-injurious behavior, aggression, elopement, or pica, that pose serious safety risks and disrupt learning in educational settings. Prior work has applied wearable sensors and machine learning to detect challenging behaviors, but has been largely confined to controlled laboratory environments. This work demonstrates that predicting challenging behavior episodes is feasible in a real-world special education classroom. We collected approximately 110.7 hours of labeled multimodal wearable data comprising accelerometry, electrodermal activity (EDA), and skin temperature from 9 children and young adults aged 10 to 21 years across standard classroom sessions. We fine-tuned state-of-the-art foundation models for multimodal wearable time-series analysis and show that challenging behavior episodes can be predicted up to 10 minutes in advance with an AUC-ROC of 0.78. These results establish a concrete foundation for developing proactive in-class intervention systems that enable teachers to minimize the safety risks of challenging behaviors in special education classrooms.

5.1 · SP · Dec 9, 2022

ProductGraphSleepNet: Sleep Staging using Product Spatio-Temporal Graph Learning with Attentive Temporal Aggregation

Aref Einizade, Samaneh Nasiri, Sepideh Hajipour Sardouie et al.

The classification of sleep stages plays a crucial role in understanding and diagnosing sleep pathophysiology. Sleep stage scoring relies heavily on visual inspection by an expert, which is a time-consuming and subjective procedure. Recently, deep learning neural network approaches have been leveraged to develop generalized automated sleep staging and account for shifts in distributions that may be caused by inherent inter/intra-subject variability, heterogeneity across datasets, and different recording environments. However, these networks ignore the connections among brain regions and disregard the sequential connections between temporally adjacent sleep epochs. To address these issues, this work proposes an adaptive product graph learning-based graph convolutional network, named ProductGraphSleepNet, for learning joint spatio-temporal graphs along with a bidirectional gated recurrent unit and a modified graph attention network to capture the attentive dynamics of sleep stage transitions. Evaluation on two public databases, the Montreal Archive of Sleep Studies (MASS) SS3 and SleepEDF, which contain full-night polysomnography recordings of 62 and 20 healthy subjects, respectively, demonstrates performance comparable to the state-of-the-art (accuracy: 0.867 and 0.838; F1-score: 0.818 and 0.774; kappa: 0.802 and 0.775, on each database respectively). More importantly, the proposed network makes it possible for clinicians to comprehend and interpret the learned connectivity graphs for sleep stages.
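The kappa figures reported in this abstract are presumably Cohen's kappa, the chance-corrected agreement commonly used in sleep staging; the abstract does not say so explicitly. Under that assumption, a minimal Python sketch of the statistic follows. The stage labels in the usage note are hypothetical.

```python
from collections import Counter

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(y_true)
    observed = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance from the two marginal label distributions.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        # Degenerate case (a single class everywhere); returning 0.0 is a
        # convention chosen for this sketch, not a standard definition.
        return 0.0
    return (observed - expected) / (1.0 - expected)
```

For example, `cohen_kappa(["W", "W", "N2", "N2", "REM", "REM"], ["W", "W", "N2", "REM", "REM", "REM"])` scores one disagreement out of six epochs after discounting the agreement expected by chance.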

2.3 · SP · Nov 15, 2023

Point-of-Care Real-Time Signal Quality for Fetal Doppler Ultrasound Using a Deep Learning Approach

Mohsen Motie-Shirazi, Reza Sameni, Peter Rohloff et al.

In this study, we present a deep learning framework designed to integrate with our previously developed system that facilitates large-scale 1D fetal Doppler data collection, aiming to enhance data quality. This system, tailored for traditional Indigenous midwives in low-resource communities, leverages a cost-effective Android phone to improve the quality of recorded signals. We have shown that the Doppler data can be used to identify fetal growth restriction, hypertension, and other concerning issues during pregnancy. However, the quality of the signal is dependent on many factors, including radio frequency interference, position of the fetus, maternal body habitus, and usage of the Doppler by the birth attendants. In order to provide instant feedback to allow correction of the data at source, a signal quality metric is required that can run in real-time on the mobile phone. In this study, 191 DUS signals with durations mainly in the range of 5 to 10 minutes were evaluated for quality and classified into five categories: Good, Poor, (Radiofrequency) Interference, Talking, and Silent, at a resolution of 3.75 seconds. A deep neural network was trained on each 3.75-second segment from these recordings and validated using five-fold cross-validation. An average micro F1 = 97.4% and macro F1 = 94.2% were achieved, with F1 = 99.2% for 'Good' quality data. These results indicate that the algorithm, which will now be implemented in the midwives' app, should allow a significant increase in the quality of data at the time of capture.
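The abstract above reports both micro and macro F1, which can differ noticeably when classes are imbalanced. The following Python sketch shows how the two averages are usually computed for a single-label multiclass problem; the function name and the toy labels in the usage note are assumptions for illustration, not taken from the paper.

```python
def f1_scores(y_true, y_pred, labels):
    """Return (per-class F1, micro F1, macro F1) for single-label predictions."""
    per_class = {}
    tp_sum = fp_sum = fn_sum = 0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
        tp_sum += tp
        fp_sum += fp
        fn_sum += fn
    # Micro F1 pools the counts over classes before computing F1.
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0
    # Macro F1 averages the per-class scores, weighting every class equally.
    macro = sum(per_class.values()) / len(labels)
    return per_class, micro, macro
```

Calling `f1_scores(["Good", "Good", "Good", "Poor", "Silent"], ["Good", "Good", "Poor", "Poor", "Silent"], ["Good", "Poor", "Silent"])` gives a micro F1 equal to plain accuracy, while the macro F1 is pulled toward the weaker minority-class score.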

2.0 · CV · Jul 31, 2024

Explainable Artificial Intelligence for Quantifying Interfering and High-Risk Behaviors in Autism Spectrum Disorder in a Real-World Classroom Environment Using Privacy-Preserving Video Analysis

Barun Das, Conor Anderson, Tania Villavicencio et al.

Rapid identification and accurate documentation of interfering and high-risk behaviors in ASD, such as aggression, self-injury, disruption, and restricted repetitive behaviors, are important in daily classroom environments for tracking intervention effectiveness and allocating appropriate resources to manage care needs. However, having staff dedicated solely to observing is costly and uncommon in most educational settings. Recently, multiple research studies have explored developing automated, continuous, and objective tools using machine learning models to quantify behaviors in ASD. However, the majority of the work was conducted in controlled environments and has not been validated for real-world conditions. In this work, we demonstrate that the latest advances in video-based group activity recognition techniques can quantify behaviors in ASD in real-world activities in classroom environments while preserving privacy. Our explainable model could detect episodes of problem behaviors with a 77% F1-score and capture distinctive behavior features in different types of behaviors in ASD. To the best of our knowledge, this is the first work that shows the promise of objectively quantifying behaviors in ASD in a real-world environment, which is an important step toward the development of a practical tool that can ease the burden of data collection for classroom staff.

4.2 · AI · Aug 19, 2024

Feasibility of assessing cognitive impairment via distributed camera network and privacy-preserving edge computing

Chaitra Hegde, Yashar Kiarashi, Allan I Levey et al.

INTRODUCTION: Mild cognitive impairment (MCI) is characterized by a decline in cognitive functions beyond typical age and education-related expectations. Since MCI has been linked to reduced social interactions and increased aimless movements, we aimed to automate the capture of these behaviors to enhance longitudinal monitoring. METHODS: Using a privacy-preserving distributed camera network, we collected movement and social interaction data from groups of individuals with MCI undergoing therapy within a 1700 m² space. We developed movement and social interaction features, which were then used to train a series of machine learning algorithms to distinguish between higher and lower cognitive functioning MCI groups. RESULTS: A Wilcoxon rank-sum test revealed statistically significant differences between high and low-functioning cohorts in features such as linear path length, walking speed, change in direction while walking, entropy of velocity and direction change, and number of group formations in the indoor space. Despite lacking individual identifiers to associate with specific levels of MCI, a machine learning approach using the most significant features provided a 71% accuracy. DISCUSSION: We provide evidence to show that a privacy-preserving low-cost camera network using an edge computing framework has the potential to distinguish between different levels of cognitive impairment from the movements and social interactions captured during group activities.

14.4 · LG · Oct 2, 2025

Detection of Chagas Disease from the ECG: The George B. Moody PhysioNet Challenge 2025

Matthew A. Reyna, Zuzana Koscova, Jan Pavlus et al.

Objective: Chagas disease is a parasitic infection that is endemic to South America, Central America, and, more recently, the U.S., primarily transmitted by insects. Chronic Chagas disease can cause cardiovascular diseases and digestive problems. Serological testing capacities for Chagas disease are limited, but Chagas cardiomyopathy often manifests in ECGs, providing an opportunity to prioritize patients for testing and treatment. Approach: The George B. Moody PhysioNet Challenge 2025 invites teams to develop algorithmic approaches for identifying Chagas disease from electrocardiograms (ECGs). Main results: This Challenge provides multiple innovations. First, we leveraged several datasets with labels from patient reports and serological testing, providing a large dataset with weak labels and smaller datasets with strong labels. Second, we augmented the data to support model robustness and generalizability to unseen data sources. Third, we applied an evaluation metric that captured the local serological testing capacity for Chagas disease to frame the machine learning problem as a triage task. Significance: Over 630 participants from 111 teams submitted over 1300 entries during the Challenge, representing diverse approaches from academia and industry worldwide.

2.3 · MED-PH · Oct 25, 2024

Electromechanical Dynamics of the Heart: A Study of Cardiac Hysteresis During Physical Stress Test

Sajjad Karimi, Shirin Karimi, Amit J. Shah et al.

Cardiovascular diseases are best diagnosed using multiple modalities that assess both the heart's electrical and mechanical functions. While effective, imaging techniques like echocardiography and nuclear imaging are costly and not widely accessible. More affordable technologies, such as simultaneous electrocardiography (ECG) and phonocardiography (PCG), may provide valuable insights into electromechanical coupling and could be useful for prescreening in low-resource settings. Using physical stress test data from the EPHNOGRAM ECG-PCG dataset, collected from 23 healthy male subjects (age: 25.4 ± 1.9 years), we investigated electromechanical intervals (RR, QT, systolic, and diastolic) and their interactions during exercise, along with hysteresis between cardiac electrical activity and mechanical responses. Time delay analysis revealed distinct temporal relationships between QT, systolic, and diastolic intervals, with RR as the primary driver. The diastolic interval showed near-synchrony with RR, while QT responded to RR interval changes with an average delay of 10.5 s, and the systolic interval responded more slowly, with an average delay of 28.3 s. We examined QT-RR, systolic-RR, and diastolic-RR hysteresis, finding narrower loops for diastolic-RR and wider loops for systolic-RR. Significant correlations (average: 0.75) were found between heart rate changes and hysteresis loop areas, suggesting the equivalent circular area diameter as a promising biomarker for cardiac function under exercise stress. Deep learning models, including Long Short-Term Memory and Convolutional Neural Networks, estimated the QT, systolic, and diastolic intervals from RR data, confirming the nonlinear relationship between RR and other intervals. Findings highlight a significant cardiac memory effect, linking ECG and PCG morphology and timing to heart rate history.

9.4 · LG · Mar 12, 2025

Edge AI for Real-time Fetal Assessment in Rural Guatemala

Nasim Katebi, Mohammad Ahmad, Mohsen Motie-Shirazi et al.

Perinatal complications, defined as conditions that arise during pregnancy, childbirth, and the immediate postpartum period, represent a significant burden on maternal and neonatal health worldwide. Factors contributing to these disparities include limited access to quality healthcare, socioeconomic inequalities, and variations in healthcare infrastructure. Addressing these issues is crucial for improving health outcomes for mothers and newborns, particularly in underserved communities. To mitigate these challenges, we have developed an AI-enabled smartphone application designed to provide decision support at the point-of-care. This tool aims to enhance health monitoring during pregnancy by leveraging machine learning (ML) techniques. The intended use of this application is to assist midwives during routine home visits by offering real-time analysis and providing feedback based on collected data. The application integrates TensorFlow Lite (TFLite) and other Python-based algorithms within a Kotlin framework to process data in real-time. It is designed for use in low-resource settings, where traditional healthcare infrastructure may be lacking. The intended patient population includes pregnant women and new mothers in underserved areas and the developed system was piloted in rural Guatemala. This ML-based solution addresses the critical need for accessible and quality perinatal care by empowering healthcare providers with decision support tools to improve maternal and neonatal health outcomes.

4.9 · HC · Dec 12, 2024

Feasibility of Detecting Cognitive Impairment and Psychological Well-being among Older Adults Using Facial, Acoustic, Linguistic, and Cardiovascular Patterns Derived from Remote Conversations

Xiaofan Mu, Merna Bibars, Salman Seyedi et al.

The aging society urgently requires scalable methods to monitor cognitive decline and identify social and psychological factors indicative of dementia risk in older adults. Our machine learning (ML) models captured facial, acoustic, linguistic, and cardiovascular features from 39 older adults with normal cognition or Mild Cognitive Impairment (MCI), derived from remote video conversations, and quantified their cognitive status, social isolation, neuroticism, and psychological well-being. Our model could distinguish Clinical Dementia Rating Scale (CDR) of 0.5 (vs. 0) with 0.77 area under the receiver operating characteristic curve (AUC), social isolation with 0.74 AUC, social satisfaction with 0.75 AUC, psychological well-being with 0.72 AUC, and negative affect with 0.74 AUC. Our feature importance analysis showed that speech and language patterns were useful for quantifying cognitive impairment, whereas facial expressions and cardiovascular patterns were useful for quantifying social and psychological well-being. Our bias analysis showed that the best-performing models for quantifying psychological well-being and cognitive states in older adults exhibited significant biases concerning their age, sex, disease condition, and education levels. Our comprehensive analysis shows the feasibility of monitoring the cognitive and psychological health of older adults, as well as the need for collecting large-scale interview datasets of older adults to benefit from the latest advances in deep learning technologies to develop generalizable models across older adults with diverse demographic backgrounds and disease conditions.

8.4 · CV · Oct 5, 2025

CARE-PD: A Multi-Site Anonymized Clinical Dataset for Parkinson's Disease Gait Assessment

Vida Adeli, Ivan Klabucar, Javad Rajabi et al.

Objective gait assessment in Parkinson's Disease (PD) is limited by the absence of large, diverse, and clinically annotated motion datasets. We introduce CARE-PD, the largest publicly available archive of 3D mesh gait data for PD, and the first multi-site collection spanning 9 cohorts from 8 clinical centers. All recordings (RGB video or motion capture) are converted into anonymized SMPL meshes via a harmonized preprocessing pipeline. CARE-PD supports two key benchmarks: supervised clinical score prediction (estimating Unified Parkinson's Disease Rating Scale, UPDRS, gait scores) and unsupervised motion pretext tasks (2D-to-3D keypoint lifting and full-body 3D reconstruction). Clinical prediction is evaluated under four generalization protocols: within-dataset, cross-dataset, leave-one-dataset-out, and multi-dataset in-domain adaptation. To assess clinical relevance, we compare state-of-the-art motion encoders with a traditional gait-feature baseline, finding that encoders consistently outperform handcrafted features. Pretraining on CARE-PD reduces MPJPE (from 60.8mm to 7.5mm) and boosts PD severity macro-F1 by 17 percentage points, underscoring the value of clinically curated, diverse training data. CARE-PD and all benchmark code are released for non-commercial research at https://neurips2025.care-pd.ca/.
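MPJPE, quoted in this abstract in millimetres, is commonly expanded as mean per-joint position error: the average Euclidean distance between predicted and reference 3D joint positions. A minimal Python sketch follows, assuming poses are given as lists of frames with `(x, y, z)` joints in the same units and with no root alignment applied; the benchmark's actual protocol may differ.

```python
import math

def mpjpe(pred, target):
    """Mean per-joint position error over all frames and joints.

    pred, target: lists of frames; each frame is a list of (x, y, z) tuples.
    Assumes matching frame and joint counts and no alignment step.
    """
    total = 0.0
    count = 0
    for frame_pred, frame_target in zip(pred, target):
        for joint_pred, joint_target in zip(frame_pred, frame_target):
            total += math.dist(joint_pred, joint_target)
            count += 1
    return total / count
```

With one frame of two joints, where one joint matches exactly and the other is offset by `(3, 4, 0)`, the error is the mean of 0 and 5 in the input units.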

1.2 · SP · Sep 12, 2025

FetalSleepNet: A Transfer Learning Framework with Spectral Equalisation Domain Adaptation for Fetal Sleep Stage Classification

Weitao Tang, Johann Vargas-Calixto, Nasim Katebi et al.

Introduction: This study presents FetalSleepNet, the first published deep learning approach to classifying sleep states from the ovine electroencephalogram (EEG). Fetal EEG is complex to acquire and difficult and laborious to interpret consistently. However, accurate sleep stage classification may aid in the early detection of abnormal brain maturation associated with pregnancy complications (e.g. hypoxia or intrauterine growth restriction). Methods: EEG electrodes were secured onto the ovine dura over the parietal cortices of 24 late-gestation fetal sheep. A lightweight deep neural network originally developed for adult EEG sleep staging was trained on the ovine EEG using transfer learning from adult EEG. A spectral equalisation-based domain adaptation strategy was used to reduce cross-domain mismatch. Results: We demonstrated that while direct transfer performed poorly, full fine-tuning combined with spectral equalisation achieved the best overall performance (accuracy: 86.6 percent, macro F1-score: 62.5), outperforming baseline models. Conclusions: To the best of our knowledge, FetalSleepNet is the first deep learning framework specifically developed for automated sleep staging from the fetal EEG. Beyond the laboratory, the EEG-based sleep stage classifier functions as a label engine, enabling large-scale weak/semi-supervised labeling and distillation to facilitate training on less invasive signals that can be acquired in the clinic, such as Doppler Ultrasound or electrocardiogram data. FetalSleepNet's lightweight design makes it well suited for deployment in low-power, real-time, and wearable fetal monitoring systems.

4.1 · LG · Jun 11, 2025

Cross-Learning Between ECG and PCG: Exploring Common and Exclusive Characteristics of Bimodal Electromechanical Cardiac Waveforms

Sajjad Karimi, Amit J. Shah, Gari D. Clifford et al.

Simultaneous electrocardiography (ECG) and phonocardiography (PCG) provide a comprehensive, multimodal perspective on cardiac function by capturing the heart's electrical and mechanical activities, respectively. However, the distinct and overlapping information content of these signals, as well as their potential for mutual reconstruction and biomarker extraction, remains incompletely understood, especially under varying physiological conditions and across individuals. In this study, we systematically investigate the common and exclusive characteristics of ECG and PCG using the EPHNOGRAM dataset of simultaneous ECG-PCG recordings during rest and exercise. We employ a suite of linear and nonlinear machine learning models, including non-causal LSTM networks, to reconstruct each modality from the other and analyze the influence of causality, physiological state, and cross-subject variability. Our results demonstrate that nonlinear models, particularly non-causal LSTM, provide superior reconstruction performance, with reconstructing ECG from PCG proving more tractable than the reverse. Exercise and cross-subject scenarios present significant challenges, but envelope-based modeling that utilizes instantaneous amplitude features substantially improves cross-subject generalizability for cross-modal learning. Furthermore, we demonstrate that clinically relevant ECG biomarkers, such as fiducial points and QT intervals, can be estimated from PCG in cross-subject settings. These findings advance our understanding of the relationship between electromechanical cardiac modalities, in terms of both waveform characteristics and the timing of cardiac events, with potential applications in novel multimodal cardiac monitoring technologies.

2.6 · LG · Apr 16, 2024

Benchmarking changepoint detection algorithms on cardiac time series

Ayse Cakmak, Erik Reinertsen, Shamim Nemati et al.

The pattern of state changes in a biomedical time series can be related to health or disease. This work presents a principled approach for selecting a changepoint detection algorithm for a specific task, such as disease classification. Eight key algorithms were compared, and the performance of each algorithm was evaluated as a function of temporal tolerance, noise, and abnormal conduction (ectopy) on realistic artificial cardiovascular time series data. All algorithms were applied to real data (cardiac time series of 22 patients with REM-behavior disorder (RBD) and 15 healthy controls) using the parameters selected on artificial data. Finally, features were derived from the detected changepoints to classify RBD patients from healthy controls using a K-Nearest Neighbors approach. On artificial data, the Modified Bayesian Changepoint Detection algorithm provided superior positive predictive value for state change identification, while Recursive Mean Difference Maximization (RMDM) achieved the highest true positive rate. For the classification task, features derived from the RMDM algorithm provided the highest leave-one-out cross-validated accuracy of 0.89 and true positive rate of 0.87. Automatically detected changepoints provide useful information about a subject's physiological state which cannot be directly observed. However, the choice of changepoint detection algorithm depends on the nature of the underlying data and the downstream application, such as a classification task. This work represents the first time changepoint detection algorithms have been compared in a meaningful way and utilized in a classification task, which demonstrates the effect of changepoint algorithm choice on application performance.
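Scoring detected changepoints against known ones "as a function of temporal tolerance" can be sketched as follows. This is one plausible scoring scheme, assuming a greedy one-to-one matching within a tolerance in samples; the paper's exact matching rule is not given in the abstract, and the function name is hypothetical.

```python
def score_changepoints(true_cps, detected_cps, tolerance):
    """Return (true positive rate, positive predictive value).

    A detection counts as a true positive if it lies within `tolerance`
    samples of a true changepoint; each detection is matched at most once.
    """
    unmatched = sorted(detected_cps)
    true_positives = 0
    for t in sorted(true_cps):
        candidates = [d for d in unmatched if abs(d - t) <= tolerance]
        if candidates:
            # Greedily take the closest still-unmatched detection.
            best = min(candidates, key=lambda d: abs(d - t))
            unmatched.remove(best)
            true_positives += 1
    tpr = true_positives / len(true_cps) if true_cps else 0.0
    ppv = true_positives / len(detected_cps) if detected_cps else 0.0
    return tpr, ppv
```

Sweeping `tolerance` over a range of values and plotting the two rates shows how forgiving the evaluation must be before an algorithm appears accurate, which is the kind of comparison the abstract describes.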

3.9 · CV · May 8, 2023 · Code

A Feasibility Study on Indoor Localization and Multi-person Tracking Using Sparsely Distributed Camera Network with Edge Computing

Hyeokhyen Kwon, Chaitra Hegde, Yashar Kiarashi et al.

Camera-based activity monitoring systems are becoming an attractive solution for smart building applications with the advances in computer vision and edge computing technologies. In this paper, we present a feasibility study and systematic analysis of a camera-based indoor localization and multi-person tracking system implemented on edge computing devices within a large indoor space. To this end, we deployed an end-to-end edge computing pipeline that utilizes multiple cameras to achieve localization, body orientation estimation and tracking of multiple individuals within a large therapeutic space spanning 1700 m², all while maintaining a strong focus on preserving privacy. Our pipeline consists of 39 edge computing camera systems equipped with Tensor Processing Units (TPUs) placed in the indoor space's ceiling. To ensure the privacy of individuals, a real-time multi-person pose estimation algorithm runs on the TPU of the computing camera system. This algorithm extracts poses and bounding boxes, which are utilized for indoor localization, body orientation estimation, and multi-person tracking. Our pipeline demonstrated an average localization error of 1.41 meters, a multiple-object tracking accuracy score of 88.6%, and a mean absolute body orientation error of 29°. These results show that localization and tracking of individuals in a large indoor space is feasible even with the privacy constraints.

3.1 · LG · Dec 28, 2021

Mythological Medical Machine Learning: Boosting the Performance of a Deep Learning Medical Data Classifier Using Realistic Physiological Models

Ismail Sadiq, Erick A. Perez-Alday, Amit J. Shah et al.

Objective: To determine if a realistic, but computationally efficient model of the electrocardiogram can be used to pre-train a deep neural network (DNN) with a wide range of morphologies and abnormalities specific to a given condition - T-wave Alternans (TWA) as a result of Post-Traumatic Stress Disorder, or PTSD - and significantly boost performance on a small database of rare individuals. Approach: Using a previously validated artificial ECG model, we generated 180,000 artificial ECGs with or without significant TWA, with varying heart rate, breathing rate, TWA amplitude, and ECG morphology. A DNN, trained on over 70,000 patients to classify 25 different rhythms, had its output layer modified to a binary class (TWA or no-TWA, or equivalently, PTSD or no-PTSD), and transfer learning was performed on the artificial ECG. In a final transfer learning step, the DNN was trained and cross-validated on ECG from 12 individuals with PTSD and 24 controls for all combinations of using the three databases. Main results: The best performing approach (AUROC = 0.77, Accuracy = 0.72, F1-score = 0.64) was found by performing both transfer learning steps, using the pre-trained arrhythmia DNN, the artificial data and the real PTSD-related ECG data. Removing the artificial data from training led to the largest drop in performance. Removing the arrhythmia data from training provided a modest, but significant, drop in performance. The final model showed no significant drop in performance on the artificial data, indicating no overfitting. Significance: In healthcare, it is common to only have a small collection of high-quality data and labels, or a larger database with much lower quality (and less relevant) labels. The paradigm presented here, involving model-based performance boosting, provides a solution through transfer learning on a large realistic artificial database, and a partially relevant real database.

15.2 · QM · Aug 2, 2021

The CirCor DigiScope Dataset: From Murmur Detection to Murmur Classification

Jorge Oliveira, Francesco Renna, Paulo Dias Costa et al.

Cardiac auscultation is one of the most cost-effective techniques used to detect and identify many heart conditions. Computer-assisted decision systems based on auscultation can support physicians in their decisions. Unfortunately, the application of such systems in clinical trials is still minimal since most of them only aim to detect the presence of extra or abnormal waves in the phonocardiogram signal, i.e., only a binary ground truth variable (normal vs abnormal) is provided. This is mainly due to the lack of large publicly available datasets, where a more detailed description of such abnormal waves (e.g., cardiac murmurs) exists. To pave the way to more effective research on healthcare recommendation systems based on auscultation, our team has prepared the currently largest pediatric heart sound dataset. A total of 5282 recordings have been collected from the four main auscultation locations of 1568 patients; in the process, 215780 heart sounds have been manually annotated. Furthermore, and for the first time, each cardiac murmur has been manually annotated by an expert annotator according to its timing, shape, pitch, grading, and quality. In addition, the auscultation locations where the murmur is present were identified, as well as the auscultation location where the murmur is detected most intensively. Such a detailed description for a relatively large number of heart sounds may pave the way for new machine learning algorithms with a real-world application for the detection and analysis of murmur waves for diagnostic purposes.

1.4CVJun 17, 2021

Privacy-Preserving Eye-tracking Using Deep Learning

Salman Seyedi, Zifan Jiang, Allan Levey et al.

The expanding usage of complex machine learning methods like deep learning has led to an explosion in human activity recognition, particularly applied to health. In particular, as part of a larger body sensor network system, face and full-body analysis is becoming increasingly common for evaluating health status. However, complex models that handle private and sometimes protected data raise concerns about the potential leak of identifiable data. In this work, we focus on the case of a deep network model trained on images of individual faces. Full-face video recordings taken from 493 individuals undergoing an eye-tracking based evaluation of neurological function were used. Outputs, gradients, intermediate layer outputs, loss, and labels were used as inputs for a deep network with an added support vector machine emission layer to recognize membership in the training data. The inference attack method and associated mathematical analysis indicate that there is a low likelihood of unintended memorization of facial features in the deep learning model. In this study, it is shown that the named model preserves the integrity of training data with reasonable confidence. The same process can be implemented in similar conditions for different models.

1.2SPApr 4, 2021

Late fusion of machine learning models using passively captured interpersonal social interactions and motion from smartphones predicts decompensation in heart failure

Ayse S. Cakmak, Samuel Densen, Gabriel Najarro et al.

Objective: Worldwide, heart failure (HF) is a major cause of morbidity and mortality and one of the leading causes of hospitalization. Early detection of HF symptoms and pro-active management may reduce adverse events. Approach: Twenty-eight participants were monitored using a smartphone app after discharge from hospitals, and each clinical event during the enrollment (N=110 clinical events) was recorded. Motion, social, location, and clinical survey data collected via the smartphone-based monitoring system were used to develop and validate an algorithm for predicting or classifying HF decompensation events (hospitalizations or clinic visits) versus clinic monitoring visits in which they were determined to be compensated or stable. Models based on a single modality as well as early and late fusion approaches combining patient-reported outcomes and passive smartphone data were evaluated. Results: The highest AUCPr for classifying decompensation with a late fusion approach was 0.80 using leave-one-subject-out cross-validation. Significance: Passively collected data from smartphones, especially when combined with weekly patient-reported outcomes, may reflect behavioral and physiological changes due to HF and thus could enable prediction of HF decompensation.

4.4LGJan 28, 2021

An Analysis Of Protected Health Information Leakage In Deep-Learning Based De-Identification Algorithms

Salman Seyedi, Li Xiong, Shamim Nemati et al.

The increasing complexity of algorithms for analyzing medical data, including de-identification tasks, raises the possibility that complex algorithms are learning not just the general representation of the problem, but specifics of given individuals within the data. Modern legal frameworks specifically prohibit the intentional or accidental distribution of patient data, but have not addressed this potential avenue for leakage of such protected health information. Modern deep learning algorithms have the highest potential of such leakage due to the complexity of the models. Recent research in the field has highlighted such issues in non-medical data, but all analysis is likely to be data and algorithm specific. We, therefore, chose to analyze a state-of-the-art free-text de-identification algorithm based on LSTM (Long Short-Term Memory) and its potential in encoding any individual in the training set. Using the i2b2 Challenge Data, we trained, then analyzed the model to assess whether the output of the LSTM, before the compression layer of the classifier, could be used to estimate the membership of the training data. Furthermore, we used different attacks, including a membership inference attack method, to attack the model. Results indicate that the attacks could not distinguish members of the training data from non-members based on the model output. This indicates that the model does not provide any strong evidence into the identification of the individuals in the training data set and there is not yet empirical evidence it is unsafe to distribute the model for general use.

2.3LGDec 20, 2020

Voting of predictive models for clinical outcomes: consensus of algorithms for the early prediction of sepsis from clinical data and an analysis of the PhysioNet/Computing in Cardiology Challenge 2019

Matthew A. Reyna, Gari D. Clifford

Although there has been significant research in boosting of weak learners, there has been little work in the field of boosting from strong learners. This latter paradigm is a form of weighted voting with learned weights. In this work, we consider the problem of constructing an ensemble algorithm from 70 individual algorithms for the early prediction of sepsis from clinical data. We find that this ensemble algorithm outperforms separate algorithms, especially on a hidden test set on which most algorithms failed to generalize.
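
The weighted-voting idea can be illustrated with a minimal sketch. This is not the ensemble procedure from the paper: the weighting rule (log-odds of each algorithm's held-out accuracy) and the function names `learn_vote_weights` and `weighted_vote` are assumptions chosen only to show what "voting with learned weights" could look like for binary predictions.

```python
import numpy as np

def learn_vote_weights(preds, labels, eps=1e-6):
    """Learn one weight per model from held-out binary predictions.

    preds:  (n_samples, n_models) array of 0/1 predictions.
    labels: (n_samples,) array of 0/1 ground-truth labels.
    The log-odds-of-accuracy rule is an illustrative assumption,
    not the weighting used in the paper.
    """
    preds = np.asarray(preds)
    labels = np.asarray(labels)
    accuracy = (preds == labels[:, None]).mean(axis=0)
    accuracy = np.clip(accuracy, eps, 1.0 - eps)  # avoid infinite weights
    return np.log(accuracy / (1.0 - accuracy))

def weighted_vote(preds, weights):
    """Combine 0/1 predictions by a weighted majority vote."""
    signed = 2 * np.asarray(preds) - 1  # map {0, 1} to {-1, +1}
    return (signed @ weights > 0).astype(int)
```

Under this rule, a model that is worse than chance on the held-out data receives a negative weight, so its votes count against the class it predicts.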

2.3LGNov 14, 2020

Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data

Ayse S. Cakmak, Nina Thigpen, Garrett Honke et al.

Depression and post-traumatic stress disorder (PTSD) are psychiatric conditions commonly associated with experiencing a traumatic event. Estimating mental health status through non-invasive techniques such as activity-based algorithms can help to identify successful early interventions. In this work, we used locomotor activity captured from 1113 individuals who wore a research grade smartwatch post-trauma. A convolutional variational autoencoder (VAE) architecture was used for unsupervised feature extraction from four weeks of actigraphy data. By using VAE latent variables and the participant's pre-trauma physical health status as features, a logistic regression classifier achieved an area under the receiver operating characteristic curve (AUC) of 0.64 to estimate mental health outcomes. The results indicate that the VAE model is a promising approach for actigraphy data analysis for mental health outcomes in long-term studies.

3.3SPMay 9, 2020

Temporal-Framing Adaptive Network for Heart Sound Segmentation without Prior Knowledge of State Duration

Xingyao Wang, Chengyu Liu, Yuwen Li et al.

Objective: This paper presents a novel heart sound segmentation algorithm based on Temporal-Framing Adaptive Network (TFAN), including state transition loss and dynamic inference for decoding the most likely state sequence. Methods: In contrast to previous state-of-the-art approaches, the TFAN-based method does not require any knowledge of the state duration of heart sounds and is therefore likely to generalize to non-sinus rhythm. The TFAN-based method was trained on 50 recordings randomly chosen from Training set A of the 2016 PhysioNet/Computing in Cardiology Challenge and tested on the other 12 independent training and test databases (2099 recordings and 52180 beats). The databases for segmentation were separated into three levels of increasing difficulty (LEVEL-I, -II and -III) for performance reporting. Results: The TFAN-based method achieved a superior F1 score for all 12 databases except for 'Test-B', with an average of 96.7%, compared to 94.6% for the state-of-the-art method. Moreover, the TFAN-based method achieved an overall F1 score of 99.2%, 94.4%, 91.4% on LEVEL-I, -II and -III data respectively, compared to 98.4%, 88.54% and 79.80% for the current state-of-the-art method. Conclusion: The TFAN-based method therefore provides a substantial improvement, particularly for more difficult cases, and on data sets not represented in the public training data. Significance: The proposed method is highly flexible and likely to apply to other non-stationary time series. Further work is required to understand to what extent this approach will provide improved diagnostic performance, although it is logical to assume superior segmentation will lead to improved diagnostics.

9.7SPJun 20, 2018

Addressing Class Imbalance in Classification Problems of Noisy Signals by using Fourier Transform Surrogates

Justus T. C. Schwabedal, John C. Snyder, Ayse Cakmak et al.

Randomizing the Fourier-transform (FT) phases of temporal-spatial data generates surrogates that approximate examples from the data-generating distribution. We propose such FT surrogates as a novel tool to augment and analyze training of neural networks and explore the approach in the example of sleep-stage classification. By computing FT surrogates of raw EEG, EOG, and EMG signals of under-represented sleep stages, we balanced the CAPSLPDB sleep database. We then trained and tested a convolutional neural network for sleep stage classification, and found that our surrogate-based augmentation improved the mean F1-score by 7%. As another application of FT surrogates, we formulated an approach to compute saliency maps for individual sleep epochs. The visualization is based on the response of inferred class probabilities under replacement of short data segments by partial surrogates. To quantify how well the distributions of the surrogates and the original data match, we evaluated a trained classifier on surrogates of correctly classified examples, and summarized these conditional predictions in a confusion matrix. We show how such conditional confusion matrices can qualitatively explain the performance of surrogates in class balancing. The FT-surrogate augmentation approach may improve classification on noisy signals if carefully adapted to the data distribution under analysis.
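
Phase randomization for a single real-valued channel can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the phases are replaced outright with uniform random values, and the function name `ft_surrogate` is hypothetical. How the paper handles multi-channel data or partial surrogates is not shown here.

```python
import numpy as np

def ft_surrogate(x, rng):
    """Return a phase-randomized surrogate of a real 1-D signal.

    The amplitude spectrum of x is kept; the phases are replaced by
    uniform random values (an assumption made for this sketch).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.size)
    surrogate_spectrum = np.abs(spectrum) * np.exp(1j * phases)
    # The DC bin (and the Nyquist bin for even n) must stay real for a
    # real-valued signal, so copy them unchanged from the original.
    surrogate_spectrum[0] = spectrum[0]
    if n % 2 == 0:
        surrogate_spectrum[-1] = spectrum[-1]
    return np.fft.irfft(surrogate_spectrum, n=n)
```

A call such as `ft_surrogate(signal, np.random.default_rng(0))` gives a new time series with the same power spectrum as `signal` but a different waveform, which is what makes it usable as an augmentation example for an under-represented class.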

12.6NCMay 7, 2018

Detection of Paroxysmal Atrial Fibrillation using Attention-based Bidirectional Recurrent Neural Networks

Supreeth P. Shashikumar, Amit J. Shah, Gari D. Clifford et al.

Detection of atrial fibrillation (AF), a type of cardiac arrhythmia, is difficult since many cases of AF are usually clinically silent and undiagnosed. In particular, paroxysmal AF is a form of AF that occurs occasionally, and has a higher probability of being undetected. In this work, we present an attention-based deep learning framework for detection of paroxysmal AF episodes from a sequence of windows. Time-frequency representations of 30-second recording windows, over a 10-minute data segment, are fed sequentially into a deep convolutional neural network for image-based feature extraction, and the extracted features are then presented to a bidirectional recurrent neural network with an attention layer for AF detection. To demonstrate the effectiveness of the proposed framework for transient AF detection, we use a database of 24-hour Holter Electrocardiogram (ECG) recordings acquired from 2850 patients at the University of Virginia heart station. The algorithm achieves an AUC of 0.94 on the testing set, which exceeds the performance of baseline models. We also demonstrate the cross-domain generalizability of the approach by adapting the learned model parameters from one recording modality (ECG) to another (photoplethysmogram) with improved AF detection performance. The proposed high accuracy, low false alarm algorithm for detecting paroxysmal AF has potential applications in long-term monitoring using wearable sensors.

2.0LGDec 1, 2017

Subject Selection on a Riemannian Manifold for Unsupervised Cross-subject Seizure Detection

Samaneh Nasiri Ghosheh Bolagh, Gari D. Clifford

Inter-subject variability between individuals poses a challenge in inter-subject brain signal analysis problems. A new algorithm for subject selection based on clustering covariance matrices on a Riemannian manifold is proposed. After unsupervised selection of the subsets of relevant subjects, data in a cluster is mapped to a tangent space at the mean point of covariance matrices in that cluster, and an SVM classifier is trained on labeled data from relevant subjects. An experiment on an EEG seizure database shows that the proposed method increases the accuracy over state-of-the-art from 86.83% to 89.84% and specificity from 87.38% to 89.64% while reducing the false positive rate/hour from 0.8/hour to 0.77/hour.
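
The tangent-space mapping step can be sketched with the standard logarithmic map for symmetric positive-definite (SPD) matrices. This is an illustration under stated assumptions: the reference point `ref` is passed in by the caller (computing the Riemannian mean of a cluster is not shown), and the names `_spd_apply` and `tangent_space` are hypothetical rather than taken from the paper.

```python
import numpy as np

def _spd_apply(mat, func):
    """Apply a scalar function to the eigenvalues of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return (vecs * func(vals)) @ vecs.T

def tangent_space(covs, ref):
    """Map SPD covariance matrices to the tangent space at `ref`.

    Each matrix C is mapped to log(ref^{-1/2} C ref^{-1/2}), which is a
    symmetric matrix; `ref` itself maps to the zero matrix.
    """
    inv_sqrt = _spd_apply(np.asarray(ref, dtype=float),
                          lambda v: 1.0 / np.sqrt(v))
    mapped = []
    for cov in covs:
        whitened = inv_sqrt @ np.asarray(cov, dtype=float) @ inv_sqrt
        whitened = 0.5 * (whitened + whitened.T)  # remove rounding asymmetry
        mapped.append(_spd_apply(whitened, np.log))
    return np.stack(mapped)
```

The upper-triangular entries of each mapped matrix could then be flattened into a feature vector for a Euclidean classifier such as an SVM.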

2.1LGMar 23, 2015

Fusing Continuous-valued Medical Labels using a Bayesian Model

Tingting Zhu, Nic Dunkley, Joachim Behar et al.

With the rapid increase in volume of time series medical data available through wearable devices, there is a need to employ automated algorithms to label data. Examples of labels include interventions, changes in activity (e.g. sleep) and changes in physiology (e.g. arrhythmias). However, automated algorithms tend to be unreliable, resulting in lower quality care. Expert annotations are scarce, expensive, and prone to significant inter- and intra-observer variance. To address these problems, a Bayesian Continuous-valued Label Aggregator (BCLA) is proposed to provide a reliable estimation of label aggregation while accurately inferring the precision and bias of each algorithm. The BCLA was applied to QT interval (pro-arrhythmic indicator) estimation from the electrocardiogram using labels from the 2006 PhysioNet/Computing in Cardiology Challenge database. It was compared to the mean, median, and a previously proposed Expectation Maximization (EM) label aggregation approach. While accurately predicting each labelling algorithm's bias and precision, the root-mean-square error of the BCLA was 11.78$\pm$0.63ms, significantly outperforming the best Challenge entry (15.37$\pm$2.13ms) as well as the EM, mean, and median voting strategies (14.76$\pm$0.52ms, 17.61$\pm$0.55ms, and 14.43$\pm$0.57ms respectively with $p<0.0001$).
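
The mean and median baselines and the root-mean-square error used for comparison can be sketched as below. This covers only the simple baselines, not the BCLA model itself; the function names `aggregate` and `rmse` and the convention of marking missing annotations with NaN are assumptions made for illustration.

```python
import numpy as np

def aggregate(labels, how="median"):
    """Fuse continuous labels from several annotators per record.

    labels: (n_records, n_annotators) array; NaN marks a missing label
    (an assumed convention for this sketch).
    """
    labels = np.asarray(labels, dtype=float)
    if how == "mean":
        return np.nanmean(labels, axis=1)
    if how == "median":
        return np.nanmedian(labels, axis=1)
    raise ValueError(f"unknown aggregation method: {how!r}")

def rmse(estimate, reference):
    """Root-mean-square error between fused labels and a reference."""
    diff = np.asarray(estimate, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

The median is less sensitive than the mean to a single annotator with a large error, which is one reason both are commonly reported as baselines.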