Fani Deligianni

CV
h-index 28
24 papers
486 citations
Novelty 42%
AI Score 47

24 Papers

CV · Oct 14, 2022 · Code
Optimizing Vision Transformers for Medical Image Segmentation

Qianying Liu, Chaitanya Kaul, Jun Wang et al.

For medical image semantic segmentation (MISS), Vision Transformers have emerged as strong alternatives to convolutional neural networks thanks to their inherent ability to capture long-range correlations. However, existing research uses off-the-shelf vision Transformer blocks based on linear projections and feature processing which lack spatial and local context to refine organ boundaries. Furthermore, Transformers do not generalize well on small medical imaging datasets and rely on large-scale pre-training due to limited inductive biases. To address these problems, we demonstrate the design of a compact and accurate Transformer network for MISS, CS-Unet, which introduces convolutions in a multi-stage design for hierarchically enhancing spatial and local modeling ability of Transformers. This is mainly achieved by our well-designed Convolutional Swin Transformer (CST) block which merges convolutions with Multi-Head Self-Attention and Feed-Forward Networks for providing inherent localized spatial context and inductive biases. Experiments demonstrate CS-Unet without pre-training outperforms other counterparts by large margins on multi-organ and cardiac datasets with fewer parameters and achieves state-of-the-art performance. Our code is available on GitHub.

CV · Sep 16, 2024 · Code
Learning Semi-Supervised Medical Image Segmentation from Spatial Registration

Qianying Liu, Paul Henderson, Xiao Gu et al.

Semi-supervised medical image segmentation has shown promise in training models with limited labeled data and abundant unlabeled data. However, state-of-the-art methods ignore a potentially valuable source of unsupervised semantic information -- spatial registration transforms between image volumes. To address this, we propose CCT-R, a contrastive cross-teaching framework incorporating registration information. To leverage the semantic information available in registrations between volume pairs, CCT-R incorporates two proposed modules: Registration Supervision Loss (RSL) and Registration-Enhanced Positive Sampling (REPS). The RSL leverages segmentation knowledge derived from transforms between labeled and unlabeled volume pairs, providing an additional source of pseudo-labels. REPS enhances contrastive learning by identifying anatomically-corresponding positives across volumes using registration transforms. Experimental results on two challenging medical segmentation benchmarks demonstrate the effectiveness and superiority of CCT-R across various semi-supervised settings, with as few as one labeled case. Our code is available at https://github.com/kathyliu579/ContrastiveCross-teachingWithRegistration.

CV · Jun 25, 2023
Multi-Scale Cross Contrastive Learning for Semi-Supervised Medical Image Segmentation

Qianying Liu, Xiao Gu, Paul Henderson et al. · oxford

Semi-supervised learning has demonstrated great potential in medical image segmentation by utilizing knowledge from unlabeled data. However, most existing approaches do not explicitly capture high-level semantic relations between distant regions, which limits their performance. In this paper, we focus on representation learning for semi-supervised learning, by developing a novel Multi-Scale Cross Supervised Contrastive Learning (MCSC) framework, to segment structures in medical images. We jointly train CNN and Transformer models, regularising their features to be semantically consistent across different scales. Our approach contrasts multi-scale features based on ground-truth and cross-predicted labels, in order to extract robust feature representations that reflect intra- and inter-slice relationships across the whole dataset. To tackle class imbalance, we take into account the prevalence of each class to guide contrastive learning and ensure that features adequately capture infrequent classes. Extensive experiments on two multi-structure medical segmentation datasets demonstrate the effectiveness of MCSC. It not only outperforms state-of-the-art semi-supervised methods by more than 3.0% in Dice, but also greatly reduces the performance gap with fully supervised methods.

CV · Oct 9, 2023
Controllable Chest X-Ray Report Generation from Longitudinal Representations

Francesco Dalla Serra, Chaoyang Wang, Fani Deligianni et al.

Radiology reports are detailed text descriptions of the content of medical scans. Each report describes the presence/absence and location of relevant clinical findings, commonly including comparison with prior exams of the same patient to describe how they evolved. Radiology reporting is a time-consuming process, and scan results are often subject to delays. One strategy to speed up reporting is to integrate automated reporting systems; however, clinical deployment requires high accuracy and interpretability. Previous approaches to automated radiology reporting generally do not provide the prior study as input, precluding comparison which is required for clinical accuracy in some types of scans, and offer only unreliable methods of interpretability. Therefore, leveraging an existing visual input format of anatomical tokens, we introduce two novel aspects: (1) longitudinal representation learning -- we input the prior scan as an additional input, proposing a method to align, concatenate and fuse the current and prior visual information into a joint longitudinal representation which can be provided to the multimodal report generation model; (2) sentence-anatomy dropout -- a training strategy for controllability in which the report generator model is trained to predict only sentences from the original report which correspond to the subset of anatomical regions given as input. We show through in-depth experiments on the MIMIC-CXR dataset how the proposed approach achieves state-of-the-art results while enabling anatomy-wise controllable report generation.

SP · Nov 27, 2023
Riemannian Prediction of Anatomical Diagnoses in Congenital Heart Disease based on 12-lead ECGs

Muhammet Alkan, Gruschen Veldtman, Fani Deligianni

Congenital heart disease (CHD) is a relatively rare disease that affects patients at birth and results in extremely heterogeneous anatomical and functional defects. The 12-lead ECG signal is routinely collected in CHD patients because it provides significant biomarkers for disease prognosis. However, developing accurate machine learning models is challenging due to the lack of large available datasets. Here, we suggest exploiting the Riemannian geometry of the spatial covariance structure of the ECG signal to improve classification. Firstly, we use covariance augmentation to mix samples across the Riemannian geodesic between corresponding classes. Secondly, we suggest projecting the covariance matrices to their respective class Riemannian mean to enhance the quality of feature extraction via tangent space projection. We perform several ablation experiments and demonstrate significant improvement compared to traditional machine learning models and deep learning on ECG time series data.
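
The two geometric steps named in this abstract can be written with the standard formulas for symmetric positive-definite (SPD) matrices. This is a sketch that assumes the affine-invariant Riemannian metric, which the abstract does not state; the symbols C_1, C_2, C_ref and t are illustrative, not taken from the paper.

```latex
% Geodesic between two SPD covariance matrices C_1 and C_2, with t in [0, 1].
% gamma(0) = C_1 and gamma(1) = C_2; intermediate t gives a "mixed" sample.
\gamma(t) = C_1^{1/2} \left( C_1^{-1/2} \, C_2 \, C_1^{-1/2} \right)^{t} C_1^{1/2}

% Tangent space projection of a covariance matrix C at a reference point
% C_ref (for example, a Riemannian mean).
S = \log\!\left( C_{\mathrm{ref}}^{-1/2} \, C \, C_{\mathrm{ref}}^{-1/2} \right)
```

Under this assumption, sampling t between 0 and 1 produces augmented covariance matrices that remain SPD, and the symmetric matrix S can be vectorised and passed to a conventional classifier.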

CV · Aug 30, 2023
Finding-Aware Anatomical Tokens for Chest X-Ray Automated Reporting

Francesco Dalla Serra, Chaoyang Wang, Fani Deligianni et al.

The task of radiology reporting comprises describing and interpreting the medical findings in radiographic images, including description of their location and appearance. Automated approaches to radiology reporting require the image to be encoded into a suitable token representation for input to the language model. Previous methods commonly use convolutional neural networks to encode an image into a series of image-level feature map representations. However, the generated reports often exhibit realistic style but imperfect accuracy. Inspired by recent works for image captioning in the general domain in which each visual token corresponds to an object detected in an image, we investigate whether using local tokens corresponding to anatomical structures can improve the quality of the generated reports. We introduce a novel adaptation of Faster R-CNN in which finding detection is performed for the candidate bounding boxes extracted during anatomical structure localisation. We use the resulting bounding box feature representations as our set of finding-aware anatomical tokens. This encourages the extracted anatomical tokens to be informative about the findings they contain (required for the final task of radiology reporting). Evaluating on the MIMIC-CXR dataset of chest X-Ray images, we show that task-aware anatomical tokens give state-of-the-art performance when integrated into an automated reporting pipeline, yielding generated reports with improved clinical accuracy.

40.0 · CV · Mar 22 · Code
Privacy-Preserving Federated Action Recognition via Differentially Private Selective Tuning and Efficient Communication

Idris Zakariyya, Pai Chet Ng, Kaushik Bhargav Sivangi et al.

Federated video action recognition enables collaborative model training without sharing raw video data, yet remains vulnerable to two key challenges: model exposure and communication overhead. Gradients exchanged between clients and the server can leak private motion patterns, while full-model synchronization of high-dimensional video networks causes significant bandwidth and communication costs. To address these issues, we propose Federated Differential Privacy with Selective Tuning and Efficient Communication for Action Recognition, namely FedDP-STECAR. Our FedDP-STECAR framework selectively fine-tunes and perturbs only a small subset of task-relevant layers under Differential Privacy (DP), reducing the surface of information leakage while preserving temporal coherence in video features. By transmitting only the tuned layers during aggregation, communication traffic is reduced by over 99% compared to full-model updates. Experiments on the UCF-101 dataset using the MViT-B-16x4 transformer show that FedDP-STECAR achieves up to 70.2% higher accuracy under strict privacy (ε = 0.65) in centralized settings and 48% faster training with 73.1% accuracy in federated setups, enabling scalable and privacy-preserving video action recognition. Code available at https://github.com/izakariyya/mvit-federated-videodp

LG · Aug 30, 2023
Consensus of state of the art mortality prediction models: From all-cause mortality to sudden death prediction

Yola Jones, Fani Deligianni, Jeff Dalton et al.

Worldwide, many millions of people die suddenly and unexpectedly each year, either with or without a prior history of cardiovascular disease. Such events are sparse (once in a lifetime), many victims will not have had prior investigations for cardiac disease and many different definitions of sudden death exist. Accordingly, sudden death is hard to predict. This analysis used NHS Electronic Health Records (EHRs) for people aged ≥50 years living in the Greater Glasgow and Clyde (GG&C) region in 2010 (n = 380,000) to try to overcome these challenges. We investigated whether medical history, blood tests, prescription of medicines, and hospitalisations might, in combination, predict a heightened risk of sudden death. We compared the performance of models trained to predict either sudden death or all-cause mortality. We built six models for each outcome of interest: three taken from state-of-the-art research (BEHRT, Deepr and Deep Patient), and three of our own creation. We trained these using two different data representations: a language-based representation, and a sparse temporal matrix. We used global interpretability to understand the most important features of each model, and compared how much agreement there was amongst models using Rank Biased Overlap. It is challenging to account for correlated variables without increasing the complexity of the interpretability technique. We overcame this by clustering features into groups and comparing the most important groups for each model. We found the agreement between models to be much higher when accounting for correlated variables. Our analysis highlights the challenge of predicting sudden death and emphasises the need for better understanding and interpretation of machine learning models applied to healthcare applications.
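
Rank Biased Overlap, the agreement measure mentioned in this abstract, can be illustrated with a short sketch. The code below computes the truncated prefix-sum form of RBO for two rankings without duplicates; it is not the extrapolated variant and not necessarily the exact formulation used in the paper. The function name and the default persistence `p=0.9` are assumptions made for illustration.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated Rank Biased Overlap of two ranked lists (no duplicates).

    At each depth d, the agreement is the size of the overlap between the
    two top-d prefixes divided by d. Agreements are weighted by p**(d - 1),
    so disagreements near the top of the rankings count the most.
    """
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    weighted_sum = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        weighted_sum += p ** (d - 1) * agreement
    return (1 - p) * weighted_sum


# Hypothetical feature rankings from two models (names are made up).
model_1 = ["age", "bnp", "statin", "admissions"]
model_2 = ["age", "statin", "bnp", "egfr"]
print(rbo_truncated(model_1, model_2))
```

Because the sum is truncated at the shorter list length, identical rankings of length k score 1 - p**k rather than exactly 1, so scores are only comparable between rankings of the same depth.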

CV · Apr 25, 2022
Adversarial Attention for Human Motion Synthesis

Matthew Malek-Podjaski, Fani Deligianni

Analysing human motions is a core topic of interest for many disciplines, from Human-Computer Interaction, to entertainment, Virtual Reality and healthcare. Deep learning has achieved impressive results in capturing human pose in real-time. On the other hand, due to high inter-subject variability, human motion analysis models often suffer from not being able to generalise to data from unseen subjects due to very limited specialised datasets available in fields such as healthcare. However, acquiring human motion datasets is highly time-consuming, challenging, and expensive. Hence, human motion synthesis is a crucial research problem within deep learning and computer vision. We present a novel method for controllable human motion synthesis by applying attention-based probabilistic deep adversarial models with end-to-end training. We show that we can generate synthetic human motion over both short- and long-time horizons through the use of adversarial attention. Furthermore, we show that we can improve the classification performance of deep learning models in cases where there is inadequate real data, by supplementing existing datasets with synthetic motions.

LG · Jan 22
Beat-ssl: Capturing Local ECG Morphology through Heartbeat-level Contrastive Learning with Soft Targets

Muhammad Ilham Rizqyawan, Peter Macfarlane, Stathis Hadjidemetriou et al.

Obtaining labelled ECG data for developing supervised models is challenging. Contrastive learning (CL) has emerged as a promising pretraining approach that enables effective transfer learning with limited labelled data. However, existing CL frameworks either focus solely on global context or fail to exploit ECG-specific characteristics. Furthermore, these methods rely on hard contrastive targets, which may not adequately capture the continuous nature of feature similarity in ECG signals. In this paper, we propose Beat-SSL, a contrastive learning framework that performs dual-context learning through both rhythm-level and heartbeat-level contrasting with soft targets. We evaluated our pretrained model on two downstream tasks: 1) multilabel classification for global rhythm assessment, and 2) ECG segmentation to assess its capacity to learn representations across both contexts. We conducted an ablation study and compared the best configuration with three other methods, including one ECG foundation model. Despite the foundation model's broader pretraining, Beat-SSL reached 93% of its performance in the multilabel classification task and surpassed all other methods in the segmentation task by 4%.

CV · Mar 1, 2024
GLFNET: Global-Local (frequency) Filter Networks for efficient medical image segmentation

Athanasios Tragakis, Qianying Liu, Chaitanya Kaul et al.

We propose a novel transformer-style architecture called Global-Local Filter Network (GLFNet) for medical image segmentation and demonstrate its state-of-the-art performance. We replace the self-attention mechanism with a combination of global-local filter blocks to optimize model efficiency. The global filters extract features from the whole feature map, whereas the local filters are adaptively created as 4x4 patches of the same feature map and add restricted scale information. In particular, the feature extraction takes place in the frequency domain rather than the commonly used spatial (image) domain to facilitate faster computations. The fusion of information from both spatial and frequency spaces creates an efficient model with regard to complexity, required data and performance. We test GLFNet on three benchmark datasets, achieving state-of-the-art performance on all of them while being almost twice as efficient in terms of GFLOPs.
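
The idea of filtering in the frequency domain can be shown with a toy one-dimensional example: transform the signal, multiply each frequency bin by a weight, and transform back. This is only a sketch of the general principle using a naive DFT from the standard library. It is not the GLFNet block, which operates on 2D feature maps with learned filters, and all function names here are hypothetical.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
        for f in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[f] * cmath.exp(2j * cmath.pi * f * k / n) for f in range(n)) / n
        for k in range(n)
    ]


def frequency_filter(signal, weights):
    """Scale every frequency bin by its weight and return the real part."""
    spectrum = dft(signal)
    filtered = [s * w for s, w in zip(spectrum, weights)]
    return [value.real for value in idft(filtered)]


signal = [1.0, 2.0, 3.0, 4.0]
# Keeping only the zero-frequency bin replaces every sample with the mean.
smoothed = frequency_filter(signal, [1.0, 0.0, 0.0, 0.0])
print(smoothed)
```

A single elementwise product in the frequency domain affects every position of the signal at once, which is why such a filter is described as global.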

CV · Nov 4, 2024
Differentially Private Integrated Decision Gradients (IDG-DP) for Radar-based Human Activity Recognition

Idris Zakariyya, Linda Tran, Kaushik Bhargav Sivangi et al.

Human motion analysis offers significant potential for healthcare monitoring and early detection of diseases. Radar-based sensing systems have attracted attention because they operate without physical contact and can integrate with pre-existing Wi-Fi networks. They are also seen as less privacy-invasive compared to camera-based systems. However, recent research has shown high accuracy in recognizing subjects or gender from radar gait patterns, raising privacy concerns. This study addresses these issues by investigating privacy vulnerabilities in radar-based Human Activity Recognition (HAR) systems and proposing a novel method for privacy preservation using Differential Privacy (DP) driven by attributions derived with the Integrated Decision Gradient (IDG) algorithm. We investigate Black-box Membership Inference Attack (MIA) Models in HAR settings across various levels of attacker-accessible information. We extensively evaluated the effectiveness of the proposed IDG-DP method by designing a CNN-based HAR model and rigorously assessing its resilience against MIAs. Experimental results demonstrate the potential of IDG-DP in mitigating privacy attacks while maintaining utility across all settings, particularly excelling against label-only and shadow model black-box MIAs. This work represents a crucial step towards balancing the need for effective radar-based HAR with robust privacy protection in healthcare environments.

AI · Jan 16, 2025
Artificial Intelligence-Driven Clinical Decision Support Systems

Muhammet Alkan, Idris Zakariyya, Samuel Leighton et al.

As artificial intelligence (AI) becomes increasingly embedded in healthcare delivery, this chapter explores the critical aspects of developing reliable and ethical Clinical Decision Support Systems (CDSS). Beginning with the fundamental transition from traditional statistical models to sophisticated machine learning approaches, this work examines rigorous validation strategies and performance assessment methods, including the crucial role of model calibration and decision curve analysis. The chapter emphasizes that creating trustworthy AI systems in healthcare requires more than just technical accuracy; it demands careful consideration of fairness, explainability, and privacy. The challenge of ensuring equitable healthcare delivery through AI is stressed, discussing methods to identify and mitigate bias in clinical predictive models. The chapter then delves into explainability as a cornerstone of human-centered CDSS. This focus reflects the understanding that healthcare professionals must not only trust AI recommendations but also comprehend their underlying reasoning. The discussion then advances to an analysis of privacy vulnerabilities in medical AI systems, from data leakage in deep learning models to sophisticated attacks against model explanations. The text explores privacy-preservation strategies such as differential privacy and federated learning, while acknowledging the inherent trade-offs between privacy protection and model performance. This progression, from technical validation to ethical considerations, reflects the multifaceted challenges of developing AI systems that can be seamlessly and reliably integrated into daily clinical practice while maintaining the highest standards of patient care and data protection.

LG · Aug 11, 2025
Cross-Subject and Cross-Montage EEG Transfer Learning via Individual Tangent Space Alignment and Spatial-Riemannian Feature Fusion

Nicole Lai-Tan, Xiao Gu, Marios G. Philiastides et al.

Personalised music-based interventions offer a powerful means of supporting motor rehabilitation by dynamically tailoring auditory stimuli to provide external timekeeping cues, modulate affective states, and stabilise gait patterns. Generalisable Brain-Computer Interfaces (BCIs) thus hold promise for adapting these interventions across individuals. However, inter-subject variability in EEG signals, further compounded by movement-induced artefacts and motor planning differences, hinders the generalisability of BCIs and results in lengthy calibration processes. We propose Individual Tangent Space Alignment (ITSA), a novel pre-alignment strategy incorporating subject-specific recentering, distribution matching, and supervised rotational alignment to enhance cross-subject generalisation. Our hybrid architecture fuses Regularised Common Spatial Patterns (RCSP) with Riemannian geometry in parallel and sequential configurations, improving class separability while maintaining the geometric structure of covariance matrices for robust statistical computation. Using leave-one-subject-out cross-validation, ITSA demonstrates significant performance improvements across subjects and conditions. The parallel fusion approach shows the greatest enhancement over its sequential counterpart, with robust performance maintained across varying data conditions and electrode configurations. The code will be made publicly available at the time of publication.

CV · May 22, 2025
Grounding Chest X-Ray Visual Question Answering with Generated Radiology Reports

Francesco Dalla Serra, Patrick Schrempf, Chaoyang Wang et al.

We present a novel approach to Chest X-ray (CXR) Visual Question Answering (VQA), addressing both single-image and image-difference questions. Single-image questions focus on abnormalities within a specific CXR ("What abnormalities are seen in image X?"), while image-difference questions compare two longitudinal CXRs acquired at different time points ("What are the differences between image X and Y?"). We further explore how the integration of radiology reports can enhance the performance of VQA models. While previous approaches have demonstrated the utility of radiology reports during the pre-training phase, we extend this idea by showing that the reports can also be leveraged as additional input to improve the VQA model's predicted answers. First, we propose a unified method that handles both types of questions and auto-regressively generates the answers. For single-image questions, the model is provided with a single CXR. For image-difference questions, the model is provided with two CXRs from the same patient, captured at different time points, enabling the model to detect and describe temporal changes. Taking inspiration from 'Chain-of-Thought reasoning', we demonstrate that performance on the CXR VQA task can be improved by grounding the answer generator module with a radiology report predicted for the same CXR. In our approach, the VQA model is divided into two steps: i) Report Generation (RG) and ii) Answer Generation (AG). Our results demonstrate that incorporating predicted radiology reports as evidence to the AG model enhances performance on both single-image and image-difference questions, achieving state-of-the-art results on the Medical-Diff-VQA dataset.

CV · Apr 14, 2025
Differentially Private 2D Human Pose Estimation

Kaushik Bhargav Sivangi, Paul Henderson, Fani Deligianni

Human pose estimation (HPE) has become essential in numerous applications including healthcare, activity recognition, and human-computer interaction. However, the privacy implications of processing sensitive visual data present significant deployment barriers in critical domains. While traditional anonymization techniques offer limited protection and often compromise data utility for broader motion analysis, Differential Privacy (DP) provides formal privacy guarantees but typically degrades model performance when applied naively. In this work, we present the first comprehensive framework for differentially private 2D human pose estimation (2D-HPE) by applying Differentially Private Stochastic Gradient Descent (DP-SGD) to this task. To effectively balance privacy with performance, we adopt Projected DP-SGD (PDP-SGD), which projects the noisy gradients to a low-dimensional subspace. Next, we incorporate Feature Differential Privacy (FDP) to selectively privatize only sensitive features while retaining public visual cues. Finally, we propose a hybrid feature-projective DP framework that combines both approaches to balance privacy and accuracy for HPE. We evaluate our approach on the MPII dataset across varying privacy budgets, training strategies, and clipping norms. Our combined feature-projective method consistently outperforms vanilla DP-SGD and individual baselines, achieving up to 82.61% mean PCKh@0.5 at ε = 0.8, substantially closing the gap to the non-private performance. This work lays the foundation for privacy-preserving human pose estimation in real-world, sensitive applications.
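
As background for the vanilla DP-SGD baseline named in this abstract, the core gradient step can be sketched as: clip each per-sample gradient to a maximum L2 norm, sum the clipped gradients, add Gaussian noise scaled to the clipping norm, and average. This is a minimal standard-library sketch of that generic step, not the paper's projected or feature-level variants; the function name, argument names and example values are assumptions.

```python
import math
import random


def dp_sgd_gradient(per_sample_grads, clip_norm, noise_multiplier, rng=None):
    """Return a privatised average gradient (generic DP-SGD step).

    per_sample_grads: list of gradient vectors, one per training example.
    clip_norm: maximum L2 norm allowed for each per-sample gradient.
    noise_multiplier: Gaussian noise std as a multiple of clip_norm.
    """
    rng = rng or random.Random()
    dim = len(per_sample_grads[0])
    clipped_sum = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down only gradients whose norm exceeds clip_norm.
        scale = 1.0 / max(1.0, norm / clip_norm)
        for i, g in enumerate(grad):
            clipped_sum[i] += g * scale
    sigma = noise_multiplier * clip_norm
    noisy_sum = [value + rng.gauss(0.0, sigma) for value in clipped_sum]
    return [value / len(per_sample_grads) for value in noisy_sum]


# Two toy per-sample gradients: the first (norm 5) is clipped to norm 1,
# the second (norm 0.5) is left unchanged.
grads = [[3.0, 4.0], [0.3, 0.4]]
print(dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.1, rng=random.Random(0)))
```

Clipping bounds how much any single example can influence the update, which is what lets the added noise translate into a formal privacy guarantee. The privacy budget ε itself comes from a separate accounting step that is not shown here.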

LG · Mar 18, 2025
Predicting Cardiopulmonary Exercise Testing Outcomes in Congenital Heart Disease Through Multi-modal Data Integration and Geometric Learning

Muhammet Alkan, Gruschen Veldtman, Fani Deligianni

Cardiopulmonary exercise testing (CPET) provides a comprehensive assessment of functional capacity by measuring key physiological variables including oxygen consumption ($VO_2$), carbon dioxide production ($VCO_2$), and pulmonary ventilation ($VE$) during exercise. Previous research has established that parameters such as peak $VO_2$ and $VE/VCO_2$ ratio serve as robust predictors of mortality risk in chronic heart failure patients. In this study, we leverage CPET variables as surrogate mortality endpoints for patients with Congenital Heart Disease (CHD). To our knowledge, this represents the first successful implementation of an advanced machine learning approach that predicts CPET outcomes by integrating electrocardiograms (ECGs) with information derived from clinical letters. Our methodology began with extracting unstructured patient information (including intervention history, diagnoses, and medication regimens) from clinical letters using natural language processing techniques, organizing this data into a structured database. We then digitized ECGs to obtain quantifiable waveforms and established comprehensive data linkages. The core innovation of our approach lies in exploiting the Riemannian geometric properties of covariance matrices derived from both 12-lead ECGs and clinical text data to develop robust regression and classification models. Through extensive ablation studies, we demonstrated that the integration of ECG signals with clinical documentation, enhanced by covariance augmentation techniques in Riemannian space, consistently produced superior predictive performance compared to conventional approaches.

AI · Jan 16, 2025
Electronic Health Records: Towards Digital Twins in Healthcare

Muhammet Alkan, Hester Huijsdens, Yola Jones et al.

The pivotal shift from traditional paper-based records to sophisticated Electronic Health Records (EHR) enabled systematic collection and analysis of patient data through descriptive statistics, providing insight into patterns and trends across patient populations. This evolution continued toward predictive analytics, allowing healthcare providers to anticipate patient outcomes and potential complications before they occur. This progression from basic digital record-keeping to sophisticated predictive modelling and digital twins reflects healthcare's broader evolution toward more integrated, patient-centred approaches that combine data-driven insights with personalized care delivery. This chapter explores the evolution and significance of healthcare information systems, beginning with an examination of the implementation of EHR in the UK and the USA. It provides a comprehensive overview of the International Classification of Diseases (ICD) system, tracing its development from ICD-9 to ICD-10. Central to this discussion is the MIMIC-III database, a landmark achievement in healthcare data sharing and arguably the most comprehensive critical care database freely available to researchers worldwide. MIMIC-III has democratized access to high-quality healthcare data, enabling unprecedented opportunities for research and analysis. The chapter examines its structure, clinical outcome analysis capabilities, and practical applications through case studies, with a particular focus on mortality and length of stay metrics, vital signs extraction, and ICD coding. Through detailed entity-relationship diagrams and practical examples, the text illustrates MIMIC's complex data structure and demonstrates how different querying approaches can lead to subtly different results, emphasizing the critical importance of understanding the database's architecture for accurate data extraction.

NC · May 15, 2023
Towards personalised music-therapy; a neurocomputational modelling perspective

Nicole Lai, Marios Philiastides, Fahim Kawsar et al.

Music therapy has emerged recently as a successful intervention that improves patient outcomes in a large range of neurological and mood disorders without adverse effects. Brain networks are entrained to music in ways that can be explained both via top-down and bottom-up processes. In particular, the direct interaction of auditory with the motor and the reward system via a predictive framework explains the efficacy of music-based interventions in motor rehabilitation. In this manuscript, we provide a brief overview of current theories of music perception and processing. Subsequently, we summarise evidence of music-based interventions primarily in motor, emotional and cardiovascular regulation. We highlight opportunities to improve quality of life and reduce stress beyond the clinic environment and in healthy individuals. This relatively unexplored area requires an understanding of how we can personalise and automate music selection processes to fit individuals' needs and tasks via feedback loops mediated by measurements of neuro-physiological responses.

SP · Jan 10, 2022
Improving ECG Classification Interpretability using Saliency Maps

Yola Jones, Fani Deligianni, Jeff Dalton

Cardiovascular disease is a large worldwide healthcare issue; symptoms often present suddenly with minimal warning. The electrocardiogram (ECG) is a fast, simple and reliable method of evaluating the health of the heart, by measuring electrical activity recorded through electrodes placed on the skin. ECGs often need to be analyzed by a cardiologist, taking time which could be spent on improving patient care and outcomes. Because of this, automatic ECG classification systems using machine learning have been proposed, which can learn complex interactions between ECG features and use this to detect abnormalities. However, algorithms built for this purpose often fail to generalize well to unseen data, reporting initially impressive results which drop dramatically when applied to new environments. Additionally, machine learning algorithms suffer a "black-box" issue, in which it is difficult to determine how a decision has been made. This is vital for applications in healthcare, as clinicians need to be able to verify the process of evaluation in order to trust the algorithm. This paper proposes a method for visualizing model decisions across each class in the MIT-BIH arrhythmia dataset, using adapted saliency maps averaged across complete classes to determine what patterns are being learned. We do this by building two algorithms based on state-of-the-art models. This paper highlights how these maps can be used to find problems in the model which could be affecting generalizability and model performance. Comparing saliency maps across complete classes gives an overall impression of confounding variables or other biases in the model, unlike what would be highlighted when comparing saliency maps on an ECG-by-ECG basis.
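
The class-level comparison described in this abstract relies on averaging saliency maps over all examples of a class. The aggregation step alone can be sketched as below, assuming each saliency map is already computed and aligned to a fixed length. How the maps are produced and adapted is specific to the paper and is not reproduced here; the function name and toy values are hypothetical.

```python
def class_average_saliency(saliency_maps, labels):
    """Average equal-length saliency maps separately for each class label."""
    sums = {}
    counts = {}
    for saliency, label in zip(saliency_maps, labels):
        if label not in sums:
            sums[label] = [0.0] * len(saliency)
            counts[label] = 0
        for i, value in enumerate(saliency):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [value / counts[label] for value in total]
        for label, total in sums.items()
    }


# Toy saliency maps over two time steps; "N" and "V" stand in for beat classes.
maps = [[1.0, 3.0], [3.0, 5.0], [0.0, 2.0]]
labels = ["N", "N", "V"]
print(class_average_saliency(maps, labels))
```

Averaging over a whole class suppresses patterns that appear in only a few recordings, so what remains is what the model attends to consistently for that class.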

CVMay 9, 2021
Towards Explainable, Privacy-Preserved Human-Motion Affect Recognition

Matthew Malek-Podjaski, Fani Deligianni

Human motion characteristics are used to monitor the progression of neurological diseases and mood disorders. Since perceptions of emotions are also interleaved with body posture and movements, emotion recognition from human gait can be used to quantitatively monitor mood changes. Many existing solutions use shallow machine learning models with raw positional data or manually extracted features to achieve this. However, gait is composed of many highly expressive characteristics that can be used to identify human subjects, and most solutions fail to address this, disregarding the subject's privacy. This work introduces a novel deep neural network architecture to disentangle human emotions and biometrics. In particular, we propose a cross-subject transfer learning technique for training a multi-encoder autoencoder deep neural network to learn disentangled latent representations of human motion features. By disentangling subject biometrics from the gait data, we show that the subject's privacy is preserved while the affect recognition performance outperforms traditional methods. Furthermore, we exploit Guided Grad-CAM to provide global explanations of the model's decision across gait cycles. We evaluate the effectiveness of our method against existing methods at recognizing emotions using both 3D temporal joint signals and manually extracted features. We also show that this data can easily be exploited to expose a subject's identity. Our method shows up to 7% improvement and highlights the joints with the most significant influence across the average gait cycle.

HCFeb 24, 2020
On-Orbit Operations Simulator for Workload Measurement during Telerobotic Training

Daniel Freer, Yao Guo, Fani Deligianni et al.

Training for telerobotic systems often makes heavy use of simulated platforms, which ensure safe operation during the learning process. Outer space is one domain in which such a simulated training platform would be useful, as On-Orbit Operations (O3) can be costly, inefficient, or even dangerous if not performed properly. In this paper, we present a new telerobotic training simulator for the Canadarm2 on the International Space Station (ISS), which is able to modulate workload through the addition of confounding factors such as latency, obstacles, and time pressure. In addition, multimodal physiological data is collected from subjects as they perform a task from the simulator under these different conditions. As most current workload measures are subjective, we analyse objective measures from the simulator and EEG data that can provide a reliable measure. ANOVA of task data revealed which simulator-based performance measures could predict the presence of latency and time pressure. Furthermore, EEG classification using a Riemannian classifier and Leave-One-Subject-Out cross-validation showed promising classification performance and allowed for comparison of different channel configurations and preprocessing methods. Additionally, Riemannian distance and beta power of EEG data were investigated as potential cross-trial and continuous workload measures.
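
The Riemannian distance mentioned above can be sketched for EEG covariance matrices. The abstract does not state which metric was used, so this sketch assumes the affine-invariant Riemannian metric on symmetric positive-definite matrices. The names `trial_covariance` and `airm_distance` are hypothetical helpers for illustration.

```python
import numpy as np


def trial_covariance(trial):
    """Spatial covariance of one EEG trial of shape (n_channels, n_samples)."""
    return np.cov(np.asarray(trial, dtype=float))


def airm_distance(A, B):
    """Affine-invariant Riemannian distance between two SPD matrices.

    d(A, B) = sqrt(sum_i log(lambda_i)^2), where lambda_i are the
    eigenvalues of A^{-1} B (assumed metric, see the note above).
    """
    # Whiten B by A through the Cholesky factor A = L L^T, so the
    # eigenvalue problem stays symmetric and numerically stable.
    L = np.linalg.cholesky(A)
    L_inv = np.linalg.inv(L)
    M = L_inv @ B @ L_inv.T
    # Symmetrise to remove round-off asymmetry before eigvalsh.
    eigvals = np.linalg.eigvalsh((M + M.T) / 2.0)
    return float(np.sqrt(np.sum(np.log(eigvals) ** 2)))


# Synthetic stand-ins for two EEG trials (4 channels, 200 samples each).
rng = np.random.default_rng(0)
cov_a = trial_covariance(rng.standard_normal((4, 200)))
cov_b = trial_covariance(2.0 * rng.standard_normal((4, 200)))
d_ab = airm_distance(cov_a, cov_b)
```

A distance of this kind, measured from each trial's covariance to a reference covariance, is one way a continuous measure could be derived. A minimum-distance-to-mean classifier would compare such distances across class means.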

HCJun 25, 2019
Intention Detection of Gait Adaptation in Natural Settings

Ines Domingos, Guang-Zhong Yang, Fani Deligianni

Gait adaptation is an important part of gait analysis and its neuronal origin and dynamics have been studied extensively. In neurorehabilitation, it is important as it perturbs neuronal dynamics and allows patients to restore some of their motor function. Exoskeletons and robotics of the lower limbs are increasingly used to facilitate rehabilitation as well as to support daily function. Their efficiency and safety depend on how well they can sense the human intention to move and adapt the gait accordingly. This paper presents a gait adaptation scheme in natural settings. It allows monitoring of subjects in a more realistic environment without the requirement of specialized equipment such as a treadmill and foot pressure sensors. We extract gait characteristics based on a single RGB camera, while wireless EEG signals are monitored simultaneously. We demonstrate that the method can not only successfully detect adaptation steps but also efficiently detect whether the subject adjusts their pace to a higher or lower speed.

AIMar 28, 2018
Artificial Intelligence and Robotics

Javier Andreu-Perez, Fani Deligianni, Daniele Ravi et al.

The recent successes of AI have captured the wildest imagination of both the scientific communities and the general public. Robotics and AI amplify human potential, increase productivity and are moving from simple reasoning towards human-like cognitive abilities. Current AI technologies are used in a wide range of applications, from healthcare, manufacturing, transport and energy to financial services, banking, advertising, management consulting and government agencies. The global AI market was around 260 billion USD in 2016 and it is estimated to exceed 3 trillion by 2024. To understand the impact of AI, it is important to draw lessons from its past successes and failures. This white paper provides a comprehensive explanation of the evolution of AI, its current status and future directions.