Payam Barnaghi

LG
h-index42
29papers
1,280citations
Novelty47%
AI Score32

29 Papers

APJul 20, 2023
A Markov Chain Model for Identifying Changes in Daily Activity Patterns of People Living with Dementia

Nan Fletcher-Lloyd, Alina-Irina Serban, Magdalena Kolanko et al.

Malnutrition and dehydration are strongly associated with increased cognitive and functional decline in people living with dementia (PLWD), as well as an increased rate of hospitalisations in comparison to their healthy counterparts. Extreme changes in eating and drinking behaviours can often lead to malnutrition and dehydration, accelerating the progression of cognitive and functional decline and resulting in a marked reduction in quality of life. Unfortunately, there are currently no established methods by which to objectively detect such changes. Here, we present the findings of an extensive quantitative analysis conducted on in-home monitoring data collected from 73 households of PLWD using Internet of Things technologies. The Coronavirus 2019 (COVID-19) pandemic has previously been shown to have dramatically altered the behavioural habits, particularly the eating and drinking habits, of PLWD. Using the COVID-19 pandemic as a natural experiment, we conducted linear mixed-effects modelling to examine changes in mean kitchen activity within a subset of 21 households of PLWD that were continuously monitored for 499 days. We report an observable increase in day-time kitchen activity and a significant decrease in night-time kitchen activity (t(147) = -2.90, p < 0.001). We further propose a novel analytical approach to detecting changes in behaviours of PLWD using Markov modelling applied to remote monitoring data as a proxy for behaviours that cannot be directly measured. Together, these results pave the way to introduce improvements into the monitoring of PLWD in naturalistic settings and for shifting from reactive to proactive care.

LGOct 4, 2022
Using Entropy Measures for Monitoring the Evolution of Activity Patterns

Yushan Huang, Yuchen Zhao, Hamed Haddadi et al.

In this work, we apply information theory inspired methods to quantify changes in daily activity patterns. We use in-home movement monitoring data and show how they can help indicate the occurrence of healthcare-related events. Three different types of entropy measures namely Shannon's entropy, entropy rates for Markov chains, and entropy production rate have been utilised. The measures are evaluated on a large-scale in-home monitoring dataset that has been collected within our dementia care clinical study. The study uses Internet of Things (IoT) enabled solutions for continuous monitoring of in-home activity, sleep, and physiology to develop care and early intervention solutions to support people living with dementia (PLWD) in their own homes. Our main goal is to show the applicability of the entropy measures to time-series activity data analysis and to use the extracted measures as new engineered features that can be fed into inference and analysis models. The results of our experiments show that in most cases the combination of these measures can indicate the occurrence of healthcare-related events. We also find that different participants with the same events may have different measures based on one entropy measure. So using a combination of these measures in an inference model will be more effective than any of the single measures.

AIFeb 22, 2023
Information Theory Inspired Pattern Analysis for Time-series Data

Yushan Huang, Yuchen Zhao, Alexander Capstick et al.

Current methods for pattern analysis in time series mainly rely on statistical features or probabilistic learning and inference methods to identify patterns and trends in the data. Such methods do not generalize well when applied to multivariate, multi-source, state-varying, and noisy time-series data. To address these issues, we propose a highly generalizable method that uses information theory-based features to identify and learn from patterns in multivariate time-series data. To demonstrate the proposed approach, we analyze pattern changes in human activity data. For applications with stochastic state transitions, features are developed based on Shannon's entropy of Markov chains, entropy rates of Markov chains, entropy production of Markov chains, and von Neumann entropy of Markov chains. For applications where state modeling is not applicable, we utilize five entropy variants, including approximate entropy, increment entropy, dispersion entropy, phase entropy, and slope entropy. The results show the proposed information theory-based features improve the recall rate, F1 score, and accuracy on average by up to 23.01% compared with the baseline models and a simpler model structure, with an average reduction of 18.75 times in the number of model parameters.

LGDec 6, 2022
Training Neural Networks on Data Sources with Unknown Reliability

Alexander Capstick, Francesca Palermo, Tianyu Cui et al.

When data is generated by multiple sources, conventional training methods update models assuming equal reliability for each source and do not consider their individual data quality. However, in many applications, sources have varied levels of reliability that can have negative effects on the performance of a neural network. A key issue is that often the quality of the data for individual sources is not known during training. Previous methods for training models in the presence of noisy data do not make use of the additional information that the source label can provide. Focusing on supervised learning, we aim to train neural networks on each data source for a number of steps proportional to the source's estimated reliability by using a dynamic re-weighting strategy motivated by likelihood tempering. This way, we allow training on all sources during the warm-up and reduce learning on less reliable sources during the final training stages, when it has been shown that models overfit to noise. We show through diverse experiments that this can significantly improve model performance when trained on mixtures of reliable and unreliable data sources, and maintain performance when models are trained on reliable sources only.

SYDec 28, 2017
Detecting Changes in Time Series Data using Volatility Filters

Alireza Ahrabian, Nazli Farajidavar, Clive Cheong-Took et al.

This work develops techniques for the sequential detection and location estimation of transient changes in the volatility (standard deviation) of time series data. In particular, we introduce a class of change detection algorithms based on the windowed volatility filter. The first method detects changes by employing a convex combination of two such filters with differing window sizes, such that the adaptively updated convex weight parameter is then used as an indicator for the detection of instantaneous power changes. Moreover, the proposed adaptive filtering based method is readily extended to the multivariate case by using recent advances in distributed adaptive filters, thereby using cooperation between the data channels for more effective detection of change points. Furthermore, this work also develops a novel change point location estimator based on the differenced output of the volatility filter. Finally, the performance of the proposed methods were evaluated on both synthetic and real world data.

LGNov 29, 2023
Interpreting Differentiable Latent States for Healthcare Time-series Data

Yu Chen, Nivedita Bijlani, Samaneh Kouchaki et al.

Machine learning enables extracting clinical insights from large temporal datasets. The applications of such machine learning models include identifying disease patterns and predicting patient outcomes. However, limited interpretability poses challenges for deploying advanced machine learning in digital healthcare. Understanding the meaning of latent states is crucial for interpreting machine learning models, assuming they capture underlying patterns. In this paper, we present a concise algorithm that allows for i) interpreting latent states using highly related input features; ii) interpreting predictions using subsets of input features via latent states; and iii) interpreting changes in latent states over time. The proposed algorithm is feasible for any model that is differentiable. We demonstrate that this approach enables the identification of a daytime behavioral pattern for predicting nocturnal behavior in a real-world healthcare dataset.

LGNov 26, 2024
AutoElicit: Using Large Language Models for Expert Prior Elicitation in Predictive Modelling

Alexander Capstick, Rahul G. Krishnan, Payam Barnaghi

Large language models (LLMs) acquire a breadth of information across various domains. However, their computational complexity, cost, and lack of transparency often hinder their direct application for predictive tasks where privacy and interpretability are paramount. In fields such as healthcare, biology, and finance, specialised and interpretable linear models still hold considerable value. In such domains, labelled data may be scarce or expensive to obtain. Well-specified prior distributions over model parameters can reduce the sample complexity of learning through Bayesian inference; however, eliciting expert priors can be time-consuming. We therefore introduce AutoElicit to extract knowledge from LLMs and construct priors for predictive models. We show these priors are informative and can be refined using natural language. We perform a careful study contrasting AutoElicit with in-context learning and demonstrate how to perform model selection between the two methods. We find that AutoElicit yields priors that can substantially reduce error over uninformative priors, using fewer labels, and consistently outperform in-context learning. We show that AutoElicit saves over 6 months of labelling effort when building a new predictive model for urinary tract infections from sensor recordings of people living with dementia.

LGMay 7, 2024
Representation Learning of Daily Movement Data Using Text Encoders

Alexander Capstick, Tianyu Cui, Yu Chen et al.

Time-series representation learning is a key area of research for remote healthcare monitoring applications. In this work, we focus on a dataset of recordings of in-home activity from people living with Dementia. We design a representation learning method based on converting activity to text strings that can be encoded using a language model fine-tuned to transform data from the same participants within a $30$-day window to similar embeddings in the vector space. This allows for clustering and vector searching over participants and days, and the identification of activity deviations to aid with personalised delivery of care.

LGMar 12, 2024
Low-Energy On-Device Personalization for MCUs

Yushan Huang, Ranya Aloufi, Xavier Cadet et al.

Microcontroller Units (MCUs) are ideal platforms for edge applications due to their low cost and energy consumption, and are widely used in various applications, including personalized machine learning tasks, where customized models can enhance the task adaptation. However, existing approaches for local on-device personalization mostly support simple ML architectures or require complex local pre-training/training, leading to high energy consumption and negating the low-energy advantage of MCUs. In this paper, we introduce $MicroT$, an efficient and low-energy MCU personalization approach. $MicroT$ includes a robust, general, but tiny feature extractor, developed through self-supervised knowledge distillation, which trains a task-specific head to enable independent on-device personalization with minimal energy and computational requirements. MicroT implements an MCU-optimized early-exit inference mechanism called stage-decision to further reduce energy costs. This mechanism allows for user-configurable exit criteria (stage-decision ratio) to adaptively balance energy cost with model performance. We evaluated MicroT using two models, three datasets, and two MCU boards. $MicroT$ outperforms traditional transfer learning (TTL) and two SOTA approaches by 2.12 - 11.60% across two models and three datasets. Targeting widely used energy-aware edge devices, MicroT's on-device training requires no additional complex operations, halving the energy cost compared to SOTA approaches by up to 2.28X while keeping SRAM usage below 1MB. During local inference, MicroT reduces energy cost by 14.17% compared to TTL across two boards and two datasets, highlighting its suitability for long-term use on energy-aware resource-constrained MCUs.

SPFeb 18, 2025
Urinary Tract Infection Detection in Digital Remote Monitoring: Strategies for Managing Participant-Specific Prediction Complexity

Kexin Fan, Alexander Capstick, Ramin Nilforooshan et al.

Urinary tract infections (UTIs) are a significant health concern, particularly for people living with dementia (PLWD), as they can lead to severe complications if not detected and treated early. This study builds on previous work that utilised machine learning (ML) to detect UTIs in PLWD by analysing in-home activity and physiological data collected through low-cost, passive sensors. The current research focuses on improving the performance of previous models, particularly by refining the Multilayer Perceptron (MLP), to better handle variations in home environments and improve sex fairness in predictions by making use of concepts from multitask learning. This study implemented three primary model designs: feature clustering, loss-dependent clustering, and participant ID embedding which were compared against a baseline MLP model. The results demonstrated that the loss-dependent MLP achieved the most significant improvements, increasing validation precision from 48.92% to 72.60% and sensitivity from 27.44% to 70.52%, while also enhancing model fairness across sexes. These findings suggest that the refined models offer a more reliable and equitable approach to early UTI detection in PLWD, addressing participant-specific data variations and enabling clinicians to detect and screen for UTI risks more effectively, thereby facilitating earlier and more accurate treatment decisions.

AIFeb 14, 2025
Analyzing Patient Daily Movement Behavior Dynamics Using Two-Stage Encoding Model

Jin Cui, Alexander Capstick, Payam Barnaghi et al.

In the analysis of remote healthcare monitoring data, time series representation learning offers substantial value in uncovering deeper patterns of patient behavior, especially given the fine temporal granularity of the data. In this study, we focus on a dataset of home activity records from people living with Dementia. We propose a two-stage self-supervised learning approach. The first stage involves converting time-series activities into text strings, which are then encoded by a fine-tuned language model. In the second stage, these time-series vectors are bi-dimensionalized for applying PageRank method, to analyze latent state transitions to quantitatively assess participants behavioral patterns and identify activity biases. These insights, combined with diagnostic data, aim to support personalized care interventions.

LGJan 30, 2025
Evaluating Spoken Language as a Biomarker for Automated Screening of Cognitive Impairment

Maria R. Lima, Alexander Capstick, Fatemeh Geranmayeh et al.

Timely and accurate assessment of cognitive impairment is a major unmet need in populations at risk. Alterations in speech and language can be early predictors of Alzheimer's disease and related dementias (ADRD) before clinical signs of neurodegeneration. Voice biomarkers offer a scalable and non-invasive solution for automated screening. However, the clinical applicability of machine learning (ML) remains limited by challenges in generalisability, interpretability, and access to patient data to train clinically applicable predictive models. Using DementiaBank recordings (N=291, 64% female), we evaluated ML techniques for ADRD screening and severity prediction from spoken language. We validated model generalisability with pilot data collected in-residence from older adults (N=22, 59% female). Risk stratification and linguistic feature importance analysis enhanced the interpretability and clinical utility of predictions. For ADRD classification, a Random Forest applied to lexical features achieved a mean sensitivity of 69.4% (95% confidence interval (CI) = 66.4-72.5) and specificity of 83.3% (78.0-88.7). On real-world pilot data, this model achieved a mean sensitivity of 70.0% (58.0-82.0) and specificity of 52.5% (39.3-65.7). For severity prediction using Mini-Mental State Examination (MMSE) scores, a Random Forest Regressor achieved a mean absolute MMSE error of 3.7 (3.7-3.8), with comparable performance of 3.3 (3.1-3.5) on pilot data. Linguistic features associated with higher ADRD risk included increased use of pronouns and adverbs, greater disfluency, reduced analytical thinking, lower lexical diversity and fewer words reflecting a psychological state of completion. Our interpretable predictive modelling offers a novel approach for in-home integration with conversational AI to monitor cognitive health and triage higher-risk individuals, enabling earlier detection and intervention.

LGFeb 13, 2025
Two-Stage Representation Learning for Analyzing Movement Behavior Dynamics in People Living with Dementia

Jin Cui, Alexander Capstick, Payam Barnaghi et al.

In remote healthcare monitoring, time series representation learning reveals critical patient behavior patterns from high-frequency data. This study analyzes home activity data from individuals living with dementia by proposing a two-stage, self-supervised learning approach tailored to uncover low-rank structures. The first stage converts time-series activities into text sequences encoded by a pre-trained language model, providing a rich, high-dimensional latent state space using a PageRank-based method. This PageRank vector captures latent state transitions, effectively compressing complex behaviour data into a succinct form that enhances interpretability. This low-rank representation not only enhances model interpretability but also facilitates clustering and transition analysis, revealing key behavioral patterns correlated with clinicalmetrics such as MMSE and ADAS-COG scores. Our findings demonstrate the framework's potential in supporting cognitive status prediction, personalized care interventions, and large-scale health monitoring.

LGJun 25, 2024
Enabling Regional Explainability by Automatic and Model-agnostic Rule Extraction

Yu Chen, Tianyu Cui, Alexander Capstick et al.

In Explainable AI, rule extraction translates model knowledge into logical rules, such as IF-THEN statements, crucial for understanding patterns learned by black-box models. This could significantly aid in fields like disease diagnosis, disease progression estimation, or drug discovery. However, such application domains often contain imbalanced data, with the class of interest underrepresented. Existing methods inevitably compromise the performance of rules for the minor class to maximise the overall performance. As the first attempt in this field, we propose a model-agnostic approach for extracting rules from specific subgroups of data, featuring automatic rule generation for numerical features. This method enhances the regional explainability of machine learning models and offers wider applicability compared to existing methods. We additionally introduce a new method for selecting features to compose rules, reducing computational costs in high-dimensional spaces. Experiments across various datasets and models demonstrate the effectiveness of our methods.

LGOct 19, 2021
Designing A Clinically Applicable Deep Recurrent Model to Identify Neuropsychiatric Symptoms in People Living with Dementia Using In-Home Monitoring Data

Francesca Palermo, Honglin Li, Alexander Capstick et al.

Agitation is one of the neuropsychiatric symptoms with high prevalence in dementia which can negatively impact the Activities of Daily Living (ADL) and the independence of individuals. Detecting agitation episodes can assist in providing People Living with Dementia (PLWD) with early and timely interventions. Analysing agitation episodes will also help identify modifiable factors such as ambient temperature and sleep as possible components causing agitation in an individual. This preliminary study presents a supervised learning model to analyse the risk of agitation in PLWD using in-home monitoring data. The in-home monitoring data includes motion sensors, physiological measurements, and the use of kitchen appliances from 46 homes of PLWD between April 2019-June 2021. We apply a recurrent deep learning model to identify agitation episodes validated and recorded by a clinical monitoring team. We present the experiments to assess the efficacy of the proposed model. The proposed model achieves an average of 79.78% recall, 27.66% precision and 37.64% F1 scores when employing the optimal parameters, suggesting a good ability to recognise agitation events. We also discuss using machine learning models for analysing the behavioural patterns using continuous monitoring data and explore clinical applicability and the choices between sensitivity and specificity in-home monitoring applications.

LGSep 10, 2021
Multimodal Federated Learning on IoT Data

Yuchen Zhao, Payam Barnaghi, Hamed Haddadi

Federated learning is proposed as an alternative to centralized machine learning since its client-server structure provides better privacy protection and scalability in real-world applications. In many applications, such as smart homes with Internet-of-Things (IoT) devices, local data on clients are generated from different modalities such as sensory, visual, and audio data. Existing federated learning systems only work on local data from a single modality, which limits the scalability of the systems. In this paper, we propose a multimodal and semi-supervised federated learning framework that trains autoencoders to extract shared or correlated representations from different local data modalities on clients. In addition, we propose a multimodal FedAvg algorithm to aggregate local autoencoders trained on different data modalities. We use the learned global autoencoder for a downstream classification task with the help of auxiliary labelled data on the server. We empirically evaluate our framework on different modalities including sensory data, depth camera videos, and RGB camera videos. Our experimental results demonstrate that introducing data from multiple modalities into federated learning can improve its classification performance. In addition, we can use labelled data from only one modality for supervised learning on the server and apply the learned model to testing data from other modalities to achieve decent F1 scores (e.g., with the best performance being higher than 60%), especially when combining contributions from both unimodal clients and multimodal clients.

LGMar 3, 2021
A Hamiltonian Monte Carlo Model for Imputation and Augmentation of Healthcare Data

Narges Pourshahrokhi, Samaneh Kouchaki, Kord M. Kober et al.

Missing values exist in nearly all clinical studies because data for a variable or question are not collected or not available. Inadequate handling of missing values can lead to biased results and loss of statistical power in analysis. Existing models usually do not consider privacy concerns or do not utilise the inherent correlations across multiple features to impute the missing values. In healthcare applications, we are usually confronted with high dimensional and sometimes small sample size datasets that need more effective augmentation or imputation techniques. Besides, imputation and augmentation processes are traditionally conducted individually. However, imputing missing values and augmenting data can significantly improve generalisation and avoid bias in machine learning models. A Bayesian approach to impute missing values and creating augmented samples in high dimensional healthcare data is proposed in this work. We propose folded Hamiltonian Monte Carlo (F-HMC) with Bayesian inference as a more practical approach to process the cross-dimensional relations by applying a random walk and Hamiltonian dynamics to adapt posterior distribution and generate large-scale samples. The proposed method is applied to a cancer symptom assessment dataset and confirmed to enrich the quality of data in precision, accuracy, recall, F1 score, and propensity metric.

AIJan 18, 2021
An attention model to analyse the risk of agitation and urinary tract infections in people with dementia

Honglin Li, Roonak Rezvani, Magdalena Anita Kolanko et al.

Behavioural symptoms and urinary tract infections (UTI) are among the most common problems faced by people with dementia. One of the key challenges in the management of these conditions is early detection and timely intervention in order to reduce distress and avoid unplanned hospital admissions. Using in-home sensing technologies and machine learning models for sensor data integration and analysis provides opportunities to detect and predict clinically significant events and changes in health status. We have developed an integrated platform to collect in-home sensor data and performed an observational study to apply machine learning models for agitation and UTI risk analysis. We collected a large dataset from 88 participants with a mean age of 82 and a standard deviation of 6.5 (47 females and 41 males) to evaluate a new deep learning model that utilises attention and rational mechanism. The proposed solution can process a large volume of data over a period of time and extract significant patterns in a time-series data (i.e. attention) and use the extracted features and patterns to train risk analysis models (i.e. rational). The proposed model can explain the predictions by indicating which time-steps and features are used in a long series of time-series data. The model provides a recall of 91\% and precision of 83\% in detecting the risk of agitation and UTIs. This model can be used for early detection of conditions such as UTIs and managing of neuropsychiatric symptoms such as agitation in association with initial treatment and early intervention approaches. In our study we have developed a set of clinical pathways for early interventions using the alerts generated by the proposed model and a clinical monitoring team has been set up to use the platform and respond to the alerts according to the created intervention plans.

RODec 8, 2020
Emotive Response to a Hybrid-Face Robot and Translation to Consumer Social Robots

Maitreyee Wairagkar, Maria R Lima, Daniel Bazo et al.

We introduce the conceptual formulation, design, fabrication, control and commercial translation with IoT connection of a hybrid-face social robot and validation of human emotional response to its affective interactions. The hybrid-face robot integrates a 3D printed faceplate and a digital display to simplify conveyance of complex facial movements while providing the impression of three-dimensional depth for natural interaction. We map the space of potential emotions of the robot to specific facial feature parameters and characterise the recognisability of the humanoid hybrid-face robot's archetypal facial expressions. We introduce pupil dilation as an additional degree of freedom for conveyance of emotive states. Human interaction experiments demonstrate the ability to effectively convey emotion from the hybrid-robot face to human observers by mapping their neurophysiological electroencephalography (EEG) response to perceived emotional information and through interviews. Results show main hybrid-face robotic expressions can be discriminated with recognition rates above 80% and invoke human emotive response similar to that of actual human faces as measured by the face-specific N170 event-related potentials in EEG. The hybrid-face robot concept has been modified, implemented, and released in the commercial IoT robotic platform Miko (My Companion), an affective robot with facial and conversational features currently in use for human-robot interaction in children by Emotix Inc. We demonstrate that human EEG responses to Miko emotions are comparative to neurophysiological responses for actual human facial recognition. Finally, interviews show above 90% expression recognition rates in our commercial robot. We conclude that simplified hybrid-face abstraction conveys emotions effectively and enhances human-robot interaction.

LGNov 27, 2020
Deep Representation for Connected Health: Semi-supervised Learning for Analysing the Risk of Urinary Tract Infections in People with Dementia

Honglin Li, Magdalena Anita Kolanko, Shirin Enshaeifar et al.

Machine learning techniques combined with in-home monitoring technologies provide a unique opportunity to automate diagnosis and early detection of adverse health conditions in long-term conditions such as dementia. However, accessing sufficient labelled training samples and integrating high-quality, routinely collected data from heterogeneous in-home monitoring technologies are main obstacles hindered utilising these technologies in real-world medicine. This work presents a semi-supervised model that can continuously learn from routinely collected in-home observation and measurement data. We show how our model can process highly imbalanced and dynamic data to make robust predictions in analysing the risk of Urinary Tract Infections (UTIs) in dementia. UTIs are common in older adults and constitute one of the main causes of avoidable hospital admissions in people with dementia (PwD). Health-related conditions, such as UTI, have a lower prevalence in individuals, which classifies them as sporadic cases (i.e. rare or scattered, yet important events). This limits the access to sufficient training data, without which the supervised learning models risk becoming overfitted or biased. We introduce a probabilistic semi-supervised learning framework to address these issues. The proposed method produces a risk analysis score for UTIs using routinely collected data by in-home sensing technologies.

LGNov 2, 2020
Semi-supervised Federated Learning for Activity Recognition

Yuchen Zhao, Hanyang Liu, Honglin Li et al.

Training deep learning models on in-home IoT sensory data is commonly used to recognise human activities. Recently, federated learning systems that use edge devices as clients to support local human activity recognition have emerged as a new paradigm to combine local (individual-level) and global (group-level) models. This approach provides better scalability and generalisability and also offers better privacy compared with the traditional centralised analysis and learning models. The assumption behind federated learning, however, relies on supervised learning on clients. This requires a large volume of labelled data, which is difficult to collect in uncontrolled IoT environments such as remote in-home monitoring. In this paper, we propose an activity recognition system that uses semi-supervised federated learning, wherein clients conduct unsupervised learning on autoencoders with unlabelled local data to learn general representations, and a cloud server conducts supervised learning on an activity classifier with labelled data. Our experimental results show that using a long short-term memory autoencoder and a Softmax classifier, the accuracy of our proposed system is higher than that of both centralised systems and semi-supervised federated learning using data augmentation. The accuracy is also comparable to that of supervised federated learning systems. Meanwhile, we demonstrate that our system can reduce the number of needed labels and the size of local models, and has faster local activity recognition speed than supervised federated learning does.

LGOct 19, 2020
Verifying the Causes of Adversarial Examples

Honglin Li, Yifei Fan, Frieder Ganz et al.

The robustness of neural networks is challenged by adversarial examples that contain almost imperceptible perturbations to inputs, which mislead a classifier to incorrect outputs in high confidence. Limited by the extreme difficulty in examining a high-dimensional image space thoroughly, research on explaining and justifying the causes of adversarial examples falls behind studies on attacks and defenses. In this paper, we present a collection of potential causes of adversarial examples and verify (or partially verify) them through carefully-designed controlled experiments. The major causes of adversarial examples include model linearity, one-sum constraint, and geometry of the categories. To control the effect of those causes, multiple techniques are applied such as $L_2$ normalization, replacement of loss functions, construction of reference datasets, and novel models using multi-layer perceptron probabilistic neural networks (MLP-PNN) and density estimation (DE). Our experiment results show that geometric factors tend to be more direct causes and statistical factors magnify the phenomenon, especially for assigning high prediction confidence. We believe this paper will inspire more studies to rigorously investigate the root causes of adversarial examples, which in turn provide useful guidance on designing more robust models.

LGMay 8, 2020
Continual Learning Using Multi-view Task Conditional Neural Networks

Honglin Li, Payam Barnaghi, Shirin Enshaeifar et al.

Conventional deep learning models have limited capacity in learning multiple tasks sequentially. The issue of forgetting the previously learned tasks in continual learning is known as catastrophic forgetting or interference. When the input data or the goal of learning change, a continual model will learn and adapt to the new status. However, the model will not remember or recognise any revisits to the previous states. This causes performance reduction and re-training curves in dealing with periodic or irregularly reoccurring changes in the data or goals. The changes in goals or data are referred to as new tasks in a continual learning model. Most of the continual learning methods have a task-known setup in which the task identities are known in advance to the learning model. We propose Multi-view Task Conditional Neural Networks (Mv-TCNN) that does not require to known the reoccurring tasks in advance. We evaluate our model on standard datasets using MNIST, CIFAR10, CIFAR100, and also a real-world dataset that we have collected in a remote healthcare monitoring study (i.e. TIHM dataset). The proposed model outperforms the state-of-the-art solutions in continual learning and adapting to new tasks that are not defined in advance.

LGOct 9, 2019
Continual Learning Using Bayesian Neural Networks

HongLin Li, Payam Barnaghi, Shirin Enshaeifar et al.

Continual learning models allow to learn and adapt to new changes and tasks over time. However, in continual and sequential learning scenarios in which the models are trained using different data with various distributions, neural networks tend to forget the previously learned knowledge. This phenomenon is often referred to as catastrophic forgetting. The catastrophic forgetting is an inevitable problem in continual learning models for dynamic environments. To address this issue, we propose a method, called Continual Bayesian Learning Networks (CBLN), which enables the networks to allocate additional resources to adapt to new tasks without forgetting the previously learned tasks. Using a Bayesian Neural Network, CBLN maintains a mixture of Gaussian posterior distributions that are associated with different tasks. The proposed method tries to optimise the number of resources that are needed to learn each task and avoids an exponential increase in the number of resources that are involved in learning multiple tasks. The proposed method does not need to access the past training data and can choose suitable weights to classify the data points during the test time automatically based on an uncertainty criterion. We have evaluated our method on the MNIST and UCR time-series datasets. The evaluation results show that our method can address the catastrophic forgetting problem at a promising rate compared to the state-of-the-art models.

LGMay 20, 2019
Continual Learning in Deep Neural Network by Using a Kalman Optimiser

Honglin Li, Shirin Enshaeifar, Frieder Ganz et al.

Learning and adapting to new distributions or learning new tasks sequentially without forgetting the previously learned knowledge is a challenging phenomenon in continual learning models. Most of the conventional deep learning models are not capable of learning new tasks sequentially in one model without forgetting the previously learned ones. We address this issue by using a Kalman Optimiser. The Kalman Optimiser divides the neural network into two parts: the long-term and short-term memory units. The long-term memory unit is used to remember the learned tasks and the short-term memory unit is to adapt to the new task. We have evaluated our method on MNIST, CIFAR10, CIFAR100 datasets and compare our results with state-of-the-art baseline models. The results show that our approach enables the model to continually learn and adapt to the new changes without forgetting the previously learned tasks.

LGNov 6, 2018
Kalman Filter Modifier for Neural Networks in Non-stationary Environments

Honglin Li, Frieder Ganz, Shirin Enshaeifar et al.

Learning in a non-stationary environment is an inevitable problem when applying machine learning algorithm to real world environment. Learning new tasks without forgetting the previous knowledge is a challenge issue in machine learning. We propose a Kalman Filter based modifier to maintain the performance of Neural Network models under non-stationary environments. The result shows that our proposed model can preserve the key information and adapts better to the changes. The accuracy of proposed model decreases by 0.4% in our experiments, while the accuracy of conventional model decreases by 90% in the drifts environment.

LGFeb 17, 2018
Machine learning for Internet of Things data analysis: A survey

Mohammad Saeid Mahdavinejad, Mohammadreza Rezvan, Mohammadamin Barekatain et al.

Rapid developments in hardware, software, and communication technologies have allowed the emergence of Internet-connected sensory devices that provide observation and data measurement from the physical world. By 2020, it is estimated that the total number of Internet-connected devices being used will be between 25 and 50 billion. As the numbers grow and technologies become more mature, the volume of data published will increase. Internet-connected devices technology, referred to as Internet of Things (IoT), continues to extend the current Internet by providing connectivity and interaction between the physical and cyber worlds. In addition to increased volume, the IoT generates Big Data characterized by velocity in terms of time and location dependency, with a variety of multiple modalities and varying data quality. Intelligent processing and analysis of this Big Data is the key to developing smart IoT applications. This article assesses the different machine learning methods that deal with the challenges in IoT data by considering smart cities as the main use case. The key contribution of this study is presentation of a taxonomy of machine learning algorithms explaining how different techniques are applied to the data in order to extract higher level information. The potential and challenges of machine learning for IoT data analytics will also be discussed. A use case of applying Support Vector Machine (SVM) on Aarhus Smart City traffic data is presented for a more detailed exploration.

LGOct 26, 2017
Segment Parameter Labelling in MCMC Mean-Shift Change Detection

Alireza Ahrabian, Shirin Enshaeifar, Clive Cheong-Took et al.

This work addresses the problem of segmentation in time series data with respect to a statistical parameter of interest in Bayesian models. It is common to assume that the parameters are distinct within each segment. As such, many Bayesian change point detection models do not exploit the segment parameter patterns, which can improve performance. This work proposes a Bayesian mean-shift change point detection algorithm that makes use of repetition in segment parameters, by introducing segment class labels that utilise a Dirichlet process prior. The performance of the proposed approach was assessed on both synthetic and real world data, highlighting the enhanced performance when using parameter labelling.

SIMay 28, 2017
A Deep Multi-View Learning Framework for City Event Extraction from Twitter Data Streams

Nazli Farajidavar, Sefki Kolozali, Payam Barnaghi

Cities have been a thriving place for citizens over the centuries due to their complex infrastructure. The emergence of the Cyber-Physical-Social Systems (CPSS) and context-aware technologies boost a growing interest in analysing, extracting and eventually understanding city events which subsequently can be utilised to leverage the citizen observations of their cities. In this paper, we investigate the feasibility of using Twitter textual streams for extracting city events. We propose a hierarchical multi-view deep learning approach to contextualise citizen observations of various city systems and services. Our goal has been to build a flexible architecture that can learn representations useful for tasks, thus avoiding excessive task-specific feature engineering. We apply our approach on a real-world dataset consisting of event reports and tweets of over four months from San Francisco Bay Area dataset and additional datasets collected from London. The results of our evaluations show that our proposed solution outperforms the existing models and can be used for extracting city related events with an averaged accuracy of 81% over all classes. To further evaluate the impact of our Twitter event extraction model, we have used two sources of authorised reports through collecting road traffic disruptions data from Transport for London API, and parsing the Time Out London website for sociocultural events. The analysis showed that 49.5% of the Twitter traffic comments are reported approximately five hours prior to the authorities official records. Moreover, we discovered that amongst the scheduled sociocultural event topics; tweets reporting transportation, cultural and social events are 31.75% more likely to influence the distribution of the Twitter comments than sport, weather and crime topics.