Andrew Campbell

LG
h-index108
24papers
1,149citations
Novelty53%
AI Score59

24 Papers

MLMay 30, 2022
A Continuous Time Framework for Discrete Denoising Models

Andrew Campbell, Joe Benton, Valentin De Bortoli et al. · oxford

We provide the first complete continuous time framework for denoising diffusion models of discrete data. This is achieved by formulating the forward noising process and corresponding reverse time generative process as Continuous Time Markov Chains (CTMCs). The model can be efficiently trained using a continuous time version of the ELBO. We simulate the high dimensional CTMC using techniques developed in chemical physics and exploit our continuous time framework to derive high performance samplers that we show can outperform discrete time methods for discrete data. The continuous time treatment also enables us to derive a novel theoretical result bounding the error between the generated sample distribution and the true data distribution.

LGJan 26, 2023
normflows: A PyTorch Package for Normalizing Flows

Vincent Stimper, David Liu, Andrew Campbell et al. · cambridge

Normalizing flows model probability distributions through an expressive tractable density. They transform a simple base distribution, such as a Gaussian, through a sequence of invertible functions, which are referred to as layers. These layers typically use neural networks to become very expressive. Flows are ubiquitous in machine learning and have been applied to image generation, text modeling, variational inference, approximating Boltzmann distributions, and many other problems. Here, we present normflows, a Python package for normalizing flows. It allows to build normalizing flow models from a suite of base distributions, flow layers, and neural networks. The package is implemented in the popular deep learning framework PyTorch, which simplifies the integration of flows in larger machine learning models or pipelines. It supports most of the common normalizing flow architectures, such as Real NVP, Glow, Masked Autoregressive Flows, Neural Spline Flows, Residual Flows, and many more. The package can be easily installed via pip and the code is publicly available on GitHub.

LGMay 30
Graph Transfer Learning via Shared Latent Geometry: Theory and Applications

Tong Wu, Andrew Campbell, Anna Scaglione

Inference and control in engineered physical systems pay a heavy physics cost at deployment: state estimators, inverse-problem solvers, model-predictive controllers, schedulers, and observers are often not closed-form and must re-solve a numerical optimization per instance, with the operator re-supplied each time. Physics-informed learning moves this cost to training, but uses a single encoder pathway whose latent geometry de-learns under fine-tuning and admits no quantitative transfer guarantee. We propose an asymmetric two-pathway architecture that resolves both issues. A teacher encoder consumes privileged dense states from a high-fidelity simulator and represents the system through operator-polynomial features stable under spectral perturbation; a student encoder learns the same latent geometry from sparse field data and operator descriptors. At deployment the teacher is discarded, and the frozen student runs in a single forward pass with a transfer certificate. The design connects to privileged-information learning, knowledge distillation, and cross-modal distillation, but targets cross-instance transfer rather than fixed-instance prediction: topology and operator may change, while the latent task does not. We establish sufficient and near-necessary transfer conditions via Wasserstein proximity between latent laws, yielding a zero-shot error bound, and develop a finite-sample certification protocol with active expansion when coverage is incomplete. The framework applies wherever a system admits an operator with reportable spectrum. On power-system estimation, it achieves zero-shot transfer to 100 unseen topologies, a 95% certificate pass rate, accuracy competitive with topology-aware Newton--Raphson, and sub-millisecond inference. These results suggest asymmetric pathways plus operator-anchored latent geometry provide a foundation for certified zero-shot inference and control.

MLMar 29, 2023
Diffusion Schrödinger Bridge Matching

Yuyang Shi, Valentin De Bortoli, Andrew Campbell et al.

Solving transport problems, i.e. finding a map transporting one given distribution to another, has numerous applications in machine learning. Novel mass transport methods motivated by generative modeling have recently been proposed, e.g. Denoising Diffusion Models (DDMs) and Flow Matching Models (FMMs) implement such a transport through a Stochastic Differential Equation (SDE) or an Ordinary Differential Equation (ODE). However, while it is desirable in many applications to approximate the deterministic dynamic Optimal Transport (OT) map which admits attractive properties, DDMs and FMMs are not guaranteed to provide transports close to the OT map. In contrast, Schrödinger bridges (SBs) compute stochastic dynamic mappings which recover entropy-regularized versions of OT. Unfortunately, existing numerical methods approximating SBs either scale poorly with dimension or accumulate errors across iterations. In this work, we introduce Iterative Markovian Fitting (IMF), a new methodology for solving SB problems, and Diffusion Schrödinger Bridge Matching (DSBM), a novel numerical algorithm for computing IMF iterates. DSBM significantly improves over previous SB numerics and recovers as special/limiting cases various recent transport methods. We demonstrate the performance of DSBM on a variety of problems.

AIMar 12Code
Learning Transferable Sensor Models via Language-Informed Pretraining

Yuliang Chen, Arvind Pillai, Yu Yvonne Wu et al.

Modern sensing systems generate large volumes of unlabeled multivariate time-series data. This abundance of unlabeled data makes self-supervised learning (SSL) a natural approach for learning transferable representations. However, most existing approaches are optimized for reconstruction or forecasting objectives and often fail to capture the semantic structure required for downstream classification and reasoning tasks. While recent sensor-language alignment methods improve semantic generalization through captioning and zero-shot transfer, they are limited to fixed sensor configurations, such as predefined channel sets, signal lengths, or temporal resolutions, which hinders cross-domain applicability. To address these gaps, we introduce \textbf{SLIP} (\textbf{S}ensor \textbf{L}anguage-\textbf{I}nformed \textbf{P}retraining), an open-source framework for learning language-aligned representations that generalize across diverse sensor setups. SLIP integrates contrastive alignment with sensor-conditioned captioning, facilitating both discriminative understanding and generative reasoning. By repurposing a pretrained decoder-only language model via cross-attention and introducing an elegant, flexible patch-embedder, SLIP supports different temporal resolutions and variable-length input at inference time without additional retraining. Across 11 datasets, SLIP demonstrates superior performance in zero-shot transfer, signal captioning, and question answering. It achieves a 77.14% average linear-probing accuracy, a 5.93% relative improvement over strong baselines, and reaches 64.83% accuracy in sensor-based question answering.

IMAug 14, 2023
Space Object Identification and Classification from Hyperspectral Material Analysis

Massimiliano Vasile, Lewis Walker, Andrew Campbell et al.

This paper presents a data processing pipeline designed to extract information from the hyperspectral signature of unknown space objects. The methodology proposed in this paper determines the material composition of space objects from single pixel images. Two techniques are used for material identification and classification: one based on machine learning and the other based on a least square match with a library of known spectra. From this information, a supervised machine learning algorithm is used to classify the object into one of several categories based on the detection of materials on the object. The behaviour of the material classification methods is investigated under non-ideal circumstances, to determine the effect of weathered materials, and the behaviour when the training library is missing a material that is present in the object being observed. Finally the paper will present some preliminary results on the identification and classification of space objects.

QMJan 8, 2024Code
Improved motif-scaffolding with SE(3) flow matching

Jason Yim, Andrew Campbell, Emile Mathieu et al.

Protein design often begins with the knowledge of a desired function from a motif which motif-scaffolding aims to construct a functional protein around. Recently, generative models have achieved breakthrough success in designing scaffolds for a range of motifs. However, generated scaffolds tend to lack structural diversity, which can hinder success in wet-lab validation. In this work, we extend FrameFlow, an SE(3) flow matching model for protein backbone generation, to perform motif-scaffolding with two complementary approaches. The first is motif amortization, in which FrameFlow is trained with the motif as input using a data augmentation strategy. The second is motif guidance, which performs scaffolding using an estimate of the conditional score from FrameFlow without additional training. On a benchmark of 24 biologically meaningful motifs, we show our method achieves 2.5 times more designable and unique motif-scaffolds compared to state-of-the-art. Code: https://github.com/microsoft/protein-frame-flow

CLDec 28, 2025
LENS: LLM-Enabled Narrative Synthesis for Mental Health by Aligning Multimodal Sensing with Language Models

Wenxuan Xu, Arvind Pillai, Subigya Nepal et al.

Multimodal health sensing offers rich behavioral signals for assessing mental health, yet translating these numerical time-series measurements into natural language remains challenging. Current LLMs cannot natively ingest long-duration sensor streams, and paired sensor-text datasets are scarce. To address these challenges, we introduce LENS, a framework that aligns multimodal sensing data with language models to generate clinically grounded mental-health narratives. LENS first constructs a large-scale dataset by transforming Ecological Momentary Assessment (EMA) responses related to depression and anxiety symptoms into natural-language descriptions, yielding over 100,000 sensor-text QA pairs from 258 participants. To enable native time-series integration, we train a patch-level encoder that projects raw sensor signals directly into an LLM's representation space. Our results show that LENS outperforms strong baselines on standard NLP metrics and task-specific measures of symptom-severity accuracy. A user study with 13 mental-health professionals further indicates that LENS-produced narratives are comprehensive and clinically meaningful. Ultimately, our approach advances LLMs as interfaces for health sensing, providing a scalable path toward models that can reason over raw behavioral signals and support downstream clinical decision-making.

LGDec 25, 2025
MotionTeller: Multi-modal Integration of Wearable Time-Series with LLMs for Health and Behavioral Understanding

Aiwei Zhang, Arvind Pillai, Andrew Campbell et al.

As wearable sensing becomes increasingly pervasive, a key challenge remains: how can we generate natural language summaries from raw physiological signals such as actigraphy - minute-level movement data collected via accelerometers? In this work, we introduce MotionTeller, a generative framework that natively integrates minute-level wearable activity data with large language models (LLMs). MotionTeller combines a pretrained actigraphy encoder with a lightweight projection module that maps behavioral embeddings into the token space of a frozen decoder-only LLM, enabling free-text, autoregressive generation of daily behavioral summaries. We construct a novel dataset of 54383 (actigraphy, text) pairs derived from real-world NHANES recordings, and train the model using cross-entropy loss with supervision only on the language tokens. MotionTeller achieves high semantic fidelity (BERTScore-F1 = 0.924) and lexical accuracy (ROUGE-1 = 0.722), outperforming prompt-based baselines by 7 percent in ROUGE-1. The average training loss converges to 0.38 by epoch 15, indicating stable optimization. Qualitative analysis confirms that MotionTeller captures circadian structure and behavioral transitions, while PCA plots reveal enhanced cluster alignment in embedding space post-training. Together, these results position MotionTeller as a scalable, interpretable system for transforming wearable sensor data into fluent, human-centered descriptions, introducing new pathways for behavioral monitoring, clinical review, and personalized health interventions.

LGMay 14
Toward World Modeling of Physiological Signals with Chaos-Theoretic Balancing and Latent Dynamics

Yunfei Luo, Xi Chen, Yuliang Chen et al.

Physiological time series signals reflect complex, multi-scale dynamical processes of the human body. Existing modeling studies focus on static tasks such as classification, event forecasting, or short-horizon next step prediction, while long-horizon signal-level forecasting and predictive nature of physiological signals remain underexplored. We introduce NormWear-2, a world model that encodes both multivariate physiological signals and clinical intervention variables into a shared latent space and models their joint temporal evolution as a dynamical system. Our approach combines inference from prior pre-trained knowledge (intuition) with instant non-parametric latent state transition adaptation (insight), enabling coherent forecasting across multiple temporal scales, conditioned on heterogeneous clinical interventions. During the pretraining phase, we find that chaos-theoretic balancing of dynamical regime diversity yields more robust representations, with a smaller balanced corpus outperforming one twice its size and capturing bifurcation regimes. We evaluate the world model performance across diverse real-world physiological datasets spanning heterogeneous temporal resolutions and intervention regimes, covering daily life, point-of-care, and clinical settings, including fitness planning, hemodialysis, diabetes management, and surgical monitoring. These evaluation datasets comprise records from 8,026 subjects, spanning study durations from 3.2 hours for high-resolution signal data to 2.3 years for longitudinal clinical biomarker tracking. NormWear-2 achieves the best overall forecasting performance across time, frequency, and latent representation domains, with significant improvements over state-of-the-art time series foundation models, while maintaining competitive downstream representation quality, providing a step toward general-purpose world models for physiological signals.

MLFeb 7, 2024
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design

Andrew Campbell, Jason Yim, Regina Barzilay et al.

Combining discrete and continuous data is an important capability for generative models. We present Discrete Flow Models (DFMs), a new flow-based model of discrete data that provides the missing link in enabling flow-based generative models to be applied to multimodal continuous and discrete data problems. Our key insight is that the discrete equivalent of continuous space flow matching can be realized using Continuous Time Markov Chains. DFMs benefit from a simple derivation that includes discrete diffusion models as a specific instance while allowing improved performance over existing diffusion-based approaches. We utilize our DFMs method to build a multimodal flow-based modeling framework. We apply this capability to the task of protein co-design, wherein we learn a model for jointly generating protein structure and sequence. Our approach achieves state-of-the-art co-design performance while allowing the same multimodal model to be used for flexible generation of the sequence or structure.

SYMay 4
Differentially Private Synthetic Voltage Phasor Release for Distribution Grids

Andrew Campbell, Chenyue Zhang, Anna Scaglione et al.

Training machine learning models, including Grid Foundation Models (GFMs), requires large volumes of realistic grid data, yet substantial privacy concerns discourage utilities and data providers from sharing load profiles and network parameters. We study the release of synthetic voltage phasor trajectories for distribution grids under differential privacy (DP). We first fit a DP generative model to historical customer loads, then propagate synthetic load trajectories through the AC power flow equations on the true admittance matrix to produce voltage phasors. The central question is whether the randomness already present in the DP synthetic loads is sufficient to protect not only the loads, but also the network topology encoded by the bus admittance matrix. We show that it is. The implication is that a corpus of voltage trajectories can be constructed from DP synthetic loads while preserving the statistics of AC power flow, which is critical for training GFMs. This preservation of the power flow statistics stands in contrast to approaches that perturb the admittance matrix directly or inject noise into the voltage outputs, both of which distort the underlying physics. Concretely, we derive $(\varepsilon,δ)$-DP guarantees for the released voltage trajectories with respect to the admittance matrix, meaning privacy of the network parameters is obtained without any additional noise mechanism. Our bound depends on the adjacency assumption, the Jacobian of the AC power flow, and the covariance of the synthetic DP-loads. Finally, we present a synthetic voltage generation procedure and an empirical evaluation against Gaussian output-perturbation baselines, demonstrating that our approach provides a clear advantage for enabling GFM training.

LGNov 9, 2023
Compressed and Sparse Models for Non-Convex Decentralized Learning

Andrew Campbell, Hang Liu, Leah Woldemariam et al.

Recent research highlights frequent model communication as a significant bottleneck to the efficiency of decentralized machine learning (ML), especially for large-scale and over-parameterized neural networks (NNs). To address this, we present Malcom-PSGD, a novel decentralized ML algorithm that combines gradient compression techniques with model sparsification. We promote model sparsity by adding $\ell_1$ regularization to the objective and present a decentralized proximal SGD method for training. Our approach employs vector source coding and dithering-based quantization for the compressed gradient communication of sparsified models. Our analysis demonstrates that Malcom-PSGD achieves a convergence rate of $\mathcal{O}(1/\sqrt{t})$ with respect to the iterations $t$, assuming a constant consensus and learning rate. This result is supported by our proof for the convergence of non-convex compressed Proximal SGD methods. Additionally, we conduct a bit analysis, providing a closed-form expression for the communication costs associated with Malcom-PSGD. Numerical results verify our theoretical findings and demonstrate that our method reduces communication costs by approximately $75\%$ when compared to the state-of-the-art.

MLDec 10, 2024
Score-Optimal Diffusion Schedules

Christopher Williams, Andrew Campbell, Arnaud Doucet et al.

Denoising diffusion models (DDMs) offer a flexible framework for sampling from high dimensional data distributions. DDMs generate a path of probability distributions interpolating between a reference Gaussian distribution and a data distribution by incrementally injecting noise into the data. To numerically simulate the sampling process, a discretisation schedule from the reference back towards clean data must be chosen. An appropriate discretisation schedule is crucial to obtain high quality samples. However, beyond hand crafted heuristics, a general method for choosing this schedule remains elusive. This paper presents a novel algorithm for adaptively selecting an optimal discretisation schedule with respect to a cost that we derive. Our cost measures the work done by the simulation procedure to transport samples from one point in the diffusion path to the next. Our method does not require hyperparameter tuning and adapts to the dynamics and geometry of the diffusion path. Our algorithm only involves the evaluation of the estimated Stein score, making it scalable to existing pre-trained models at inference time and online during training. We find that our learned schedule recovers performant schedules previously only discovered through manual search and obtains competitive FID scores on image datasets.

MLOct 4, 2025
Self-Speculative Masked Diffusions

Andrew Campbell, Valentin De Bortoli, Jiaxin Shi et al.

We present self-speculative masked diffusions, a new class of masked diffusion generative models for discrete data that require significantly fewer function evaluations to generate samples. Standard masked diffusion models predict factorized logits over currently masked positions. A number of masked positions are then sampled, however, the factorization approximation means that sampling too many positions in one go leads to poor sample quality. As a result, many simulation steps and therefore neural network function evaluations are required to generate high-quality data. We reduce the computational burden by generating non-factorized predictions over masked positions. This is achieved by modifying the final transformer attention mask from non-causal to causal, enabling draft token generation and parallel validation via a novel, model-integrated speculative sampling mechanism. This results in a non-factorized predictive distribution over masked positions in a single forward pass. We apply our method to GPT2 scale text modelling and protein sequences generation, finding that we can achieve a ~2x reduction in the required number of network forward passes relative to standard masked diffusion models.

LGFeb 11, 2025
Time2Lang: Bridging Time-Series Foundation Models and Large Language Models for Health Sensing Beyond Prompting

Arvind Pillai, Dimitris Spathis, Subigya Nepal et al.

Large language models (LLMs) show promise for health applications when combined with behavioral sensing data. Traditional approaches convert sensor data into text prompts, but this process is prone to errors, computationally expensive, and requires domain expertise. These challenges are particularly acute when processing extended time series data. While time series foundation models (TFMs) have recently emerged as powerful tools for learning representations from temporal data, bridging TFMs and LLMs remains challenging. Here, we present Time2Lang, a framework that directly maps TFM outputs to LLM representations without intermediate text conversion. Our approach first trains on synthetic data using periodicity prediction as a pretext task, followed by evaluation on mental health classification tasks. We validate Time2Lang on two longitudinal wearable and mobile sensing datasets: daily depression prediction using step count data (17,251 days from 256 participants) and flourishing classification based on conversation duration (46 participants over 10 weeks). Time2Lang maintains near constant inference times regardless of input length, unlike traditional prompting methods. The generated embeddings preserve essential time-series characteristics such as auto-correlation. Our results demonstrate that TFMs and LLMs can be effectively integrated while minimizing information loss and enabling performance transfer across these distinct modeling paradigms. To our knowledge, we are the first to integrate a TFM and an LLM for health, thus establishing a foundation for future research combining general-purpose large models for complex healthcare tasks.

LGJul 30, 2025
Decentralized Differentially Private Power Method

Andrew Campbell, Anna Scaglione, Sean Peisert

We propose a novel Decentralized Differentially Private Power Method (D-DP-PM) for performing Principal Component Analysis (PCA) in networked multi-agent settings. Unlike conventional decentralized PCA approaches where each agent accesses the full n-dimensional sample space, we address the challenging scenario where each agent observes only a subset of dimensions through row-wise data partitioning. Our method ensures $(ε,δ)$-Differential Privacy (DP) while enabling collaborative estimation of global eigenvectors across the network without requiring a central aggregator. We achieve this by having agents share only local embeddings of the current eigenvector iterate, leveraging both the inherent privacy from random initialization and carefully calibrated Gaussian noise additions. We prove that our algorithm satisfies the prescribed $(ε,δ)$-DP guarantee and establish convergence rates that explicitly characterize the impact of the network topology. Our theoretical analysis, based on linear dynamics and high-dimensional probability theory, provides tight bounds on both privacy and utility. Experiments on real-world datasets demonstrate that D-DP-PM achieves superior privacy-utility tradeoffs compared to naive local DP approaches, with particularly strong performance in moderate privacy regimes ($ε\in[2, 5]$). The method converges rapidly, allowing practitioners to trade iterations for enhanced privacy while maintaining competitive utility.

LGApr 8, 2025
A Self-Supervised Framework for Space Object Behaviour Characterisation

Ian Groves, Andrew Campbell, James Fernandes et al.

Foundation Models, pre-trained on large unlabelled datasets before task-specific fine-tuning, are increasingly being applied to specialised domains. Recent examples include ClimaX for climate and Clay for satellite Earth observation, but a Foundation Model for Space Object Behavioural Analysis has not yet been developed. As orbital populations grow, automated methods for characterising space object behaviour are crucial for space safety. We present a Space Safety and Sustainability Foundation Model focusing on space object behavioural analysis using light curves (LCs). We implemented a Perceiver-Variational Autoencoder (VAE) architecture, pre-trained with self-supervised reconstruction and masked reconstruction on 227,000 LCs from the MMT-9 observatory. The VAE enables anomaly detection, motion prediction, and LC generation. We fine-tuned the model for anomaly detection & motion prediction using two independent LC simulators (CASSANDRA and GRIAL respectively), using CAD models of boxwing, Sentinel-3, SMOS, and Starlink platforms. Our pre-trained model achieved a reconstruction error of 0.01%, identifying potentially anomalous light curves through reconstruction difficulty. After fine-tuning, the model scored 88% and 82% accuracy, with 0.90 and 0.95 ROC AUC scores respectively in both anomaly detection and motion mode prediction (sun-pointing, spin, etc.). Analysis of high-confidence anomaly predictions on real data revealed distinct patterns including characteristic object profiles and satellite glinting. Here, we demonstrate how self-supervised learning can simultaneously enable anomaly detection, motion prediction, and synthetic data generation from rich representations learned in pre-training. Our work therefore supports space safety and sustainability through automated monitoring and simulation capabilities.

SPDec 19, 2023
Hyperspectral Lightcurve Inversion for Attitude Determination

Simão da Graça Marto, Massimiliano Vasile, Andrew Campbell et al.

Spectral lightcurves consisting of time series single-pixel spectral measurements of spacecraft are used to infer the spacecraft's attitude and rotation. Two methods are used. One based on numerical optimisation of a regularised least squares cost function, and another based on machine learning with a neural network model. The aim is to work with minimal information, thus no prior is available on the attitude nor on the inertia tensor. The theoretical and practical aspects of this task are investigated, and the methodology is tested on synthetic data. Results are shown based on synthetic data.

LGMay 31, 2023
Rare Life Event Detection via Mobile Sensing Using Multi-Task Learning

Arvind Pillai, Subigya Nepal, Andrew Campbell

Rare life events significantly impact mental health, and their detection in behavioral studies is a crucial step towards health-based interventions. We envision that mobile sensing data can be used to detect these anomalies. However, the human-centered nature of the problem, combined with the infrequency and uniqueness of these events makes it challenging for unsupervised machine learning methods. In this paper, we first investigate granger-causality between life events and human behavior using sensing data. Next, we propose a multi-task framework with an unsupervised autoencoder to capture irregular behavior, and an auxiliary sequence predictor that identifies transitions in workplace performance to contextualize events. We perform experiments using data from a mobile sensing study comprising N=126 information workers from multiple industries, spanning 10106 days with 198 rare events (<2%). Through personalized inference, we detect the exact day of a rare event with an F1 of 0.34, demonstrating that our method outperforms several baselines. Finally, we discuss the implications of our work from the context of real-world deployment.

MLMay 25, 2023
Trans-Dimensional Generative Modeling via Jump Diffusion Models

Andrew Campbell, William Harvey, Christian Weilbach et al.

We propose a new class of generative models that naturally handle data of varying dimensionality by jointly modeling the state and dimension of each datapoint. The generative process is formulated as a jump diffusion process that makes jumps between different dimensional spaces. We first define a dimension destroying forward noising process, before deriving the dimension creating time-reversed generative process along with a novel evidence lower bound training objective for learning to approximate it. Simulating our learned approximation to the time-reversed generative process then provides an effective way of sampling data of varying dimensionality by jointly generating state values and dimensions. We demonstrate our approach on molecular and video datasets of varying dimensionality, reporting better compatibility with test-time diffusion guidance imputation tasks and improved interpolation capabilities versus fixed dimensional models that generate state values and dimensions separately.

MLOct 26, 2021
Online Variational Filtering and Parameter Learning

Andrew Campbell, Yuyang Shi, Tom Rainforth et al.

We present a variational method for online state estimation and parameter learning in state-space models (SSMs), a ubiquitous class of latent variable models for sequential data. As per standard batch variational techniques, we use stochastic gradients to simultaneously optimize a lower bound on the log evidence with respect to both model parameters and a variational approximation of the states' posterior distribution. However, unlike existing approaches, our method is able to operate in an entirely online manner, such that historic observations do not require revisitation after being incorporated and the cost of updates at each time step remains constant, despite the growing dimensionality of the joint posterior distribution of the states. This is achieved by utilizing backward decompositions of this joint posterior distribution and of its variational approximation, combined with Bellman-type recursions for the evidence lower bound and its gradients. We demonstrate the performance of this methodology across several examples, including high-dimensional SSMs and sequential Variational Auto-Encoders.

SPJun 25, 2021
Patient-independent Schizophrenia Relapse Prediction Using Mobile Sensor based Daily Behavioral Rhythm Changes

Bishal Lamichhane, Dror Ben-Zeev, Andrew Campbell et al.

A schizophrenia relapse has severe consequences for a patient's health, work, and sometimes even life safety. If an oncoming relapse can be predicted on time, for example by detecting early behavioral changes in patients, then interventions could be provided to prevent the relapse. In this work, we investigated a machine learning based schizophrenia relapse prediction model using mobile sensing data to characterize behavioral features. A patient-independent model providing sequential predictions, closely representing the clinical deployment scenario for relapse prediction, was evaluated. The model uses the mobile sensing data from the recent four weeks to predict an oncoming relapse in the next week. We used the behavioral rhythm features extracted from daily templates of mobile sensing data, self-reported symptoms collected via EMA (Ecological Momentary Assessment), and demographics to compare different classifiers for the relapse prediction. Naive Bayes based model gave the best results with an F2 score of 0.083 when evaluated in a dataset consisting of 63 schizophrenia patients, each monitored for up to a year. The obtained F2 score, though low, is better than the baseline performance of random classification (F2 score of 0.02 $\pm$ 0.024). Thus, mobile sensing has predictive value for detecting an oncoming relapse and needs further investigation to improve the current performance. Towards that end, further feature engineering and model personalization based on the behavioral idiosyncrasies of a patient could be helpful.

LGJun 22, 2021
Routine Clustering of Mobile Sensor Data Facilitates Psychotic Relapse Prediction in Schizophrenia Patients

Joanne Zhou, Bishal Lamichhane, Dror Ben-Zeev et al.

We aim to develop clustering models to obtain behavioral representations from continuous multimodal mobile sensing data towards relapse prediction tasks. The identified clusters could represent different routine behavioral trends related to daily living of patients as well as atypical behavioral trends associated with impending relapse. We used the mobile sensing data obtained in the CrossCheck project for our analysis. Continuous data from six different mobile sensing-based modalities (e.g. ambient light, sound/conversation, acceleration etc.) obtained from a total of 63 schizophrenia patients, each monitored for up to a year, were used for the clustering models and relapse prediction evaluation. Two clustering models, Gaussian Mixture Model (GMM) and Partition Around Medoids (PAM), were used to obtain behavioral representations from the mobile sensing data. The features obtained from the clustering models were used to train and evaluate a personalized relapse prediction model using Balanced Random Forest. The personalization was done by identifying optimal features for a given patient based on a personalization subset consisting of other patients who are of similar age. The clusters identified using the GMM and PAM models were found to represent different behavioral patterns (such as clusters representing sedentary days, active but with low communications days, etc.). Significant changes near the relapse periods were seen in the obtained behavioral representation features from the clustering models. The clustering model based features, together with other features characterizing the mobile sensing data, resulted in an F2 score of 0.24 for the relapse prediction task in a leave-one-patient-out evaluation setting. This obtained F2 score is significantly higher than a random classification baseline with an average F2 score of 0.042.