Russell Greiner

LG
h-index32
49papers
1,664citations
Novelty49%
AI Score55

49 Papers

LGJun 1, 2023
An Effective Meaningful Way to Evaluate Survival Models

Shi-ang Qi, Neeraj Kumar, Mahtab Farrokh et al.

One straightforward metric to evaluate a survival prediction model is based on the Mean Absolute Error (MAE) -- the average of the absolute difference between the time predicted by the model and the true event time, over all subjects. Unfortunately, this is challenging because, in practice, the test set includes (right) censored individuals, meaning we do not know when a censored individual actually experienced the event. In this paper, we explore various metrics to estimate MAE for survival datasets that include (many) censored individuals. Moreover, we introduce a novel and effective approach for generating realistic semi-synthetic survival datasets to facilitate the evaluation of metrics. Our findings, based on the analysis of the semi-synthetic datasets, reveal that our proposed metric (MAE using pseudo-observations) is able to rank models accurately based on their performance, and often closely matches the true MAE -- in particular, is better than several alternative methods.

LGJun 20, 2023
Copula-Based Deep Survival Models for Dependent Censoring

Ali Hossein Gharari Foomani, Michael Cooper, Russell Greiner et al.

A survival dataset describes a set of instances (e.g. patients) and provides, for each, either the time until an event (e.g. death), or the censoring time (e.g. when lost to follow-up - which is a lower bound on the time until the event). We consider the challenge of survival prediction: learning, from such data, a predictive model that can produce an individual survival distribution for a novel instance. Many contemporary methods of survival prediction implicitly assume that the event and censoring distributions are independent conditional on the instance's covariates - a strong assumption that is difficult to verify (as we observe only one outcome for each instance) and which can induce significant bias when it does not hold. This paper presents a parametric model of survival that extends modern non-linear survival analysis by relaxing the assumption of conditional independence. On synthetic and semi-synthetic data, our approach significantly improves estimates of survival distributions compared to the standard that assumes conditional independence in the data.

SPOct 6, 2022
ECG for high-throughput screening of multiple diseases: Proof-of-concept using multi-diagnosis deep learning from population-based datasets

Weijie Sun, Sunil Vasu Kalmady, Amir Salimi et al.

Electrocardiogram (ECG) abnormalities are linked to cardiovascular diseases, but may also occur in other non-cardiovascular conditions such as mental, neurological, metabolic and infectious conditions. However, most of the recent success of deep learning (DL) based diagnostic predictions in selected patient cohorts have been limited to a small set of cardiac diseases. In this study, we use a population-based dataset of >250,000 patients with >1000 medical conditions and >2 million ECGs to identify a wide range of diseases that could be accurately diagnosed from the patient's first in-hospital ECG. Our DL models uncovered 128 diseases and 68 disease categories with strong discriminative performance.

SPNov 14, 2022
Improving ECG-based COVID-19 diagnosis and mortality predictions using pre-pandemic medical records at population-scale

Weijie Sun, Sunil Vasu Kalmady, Nariman Sepehrvand et al.

Pandemic outbreaks such as COVID-19 occur unexpectedly, and need immediate action due to their potential devastating consequences on global health. Point-of-care routine assessments such as electrocardiogram (ECG), can be used to develop prediction models for identifying individuals at risk. However, there is often too little clinically-annotated medical data, especially in early phases of a pandemic, to develop accurate prediction models. In such situations, historical pre-pandemic health records can be utilized to estimate a preliminary model, which can then be fine-tuned based on limited available pandemic data. This study shows this approach -- pre-train deep learning models with pre-pandemic data -- can work effectively, by demonstrating substantial performance improvement over three different COVID-19 related diagnostic and prognostic prediction tasks. Similar transfer learning strategies can be useful for developing timely artificial intelligence solutions in future pandemic outbreaks.

LGSep 10, 2024Code
MENSA: A Multi-Event Network for Survival Analysis with Trajectory-based Likelihood Estimation

Christian Marius Lillelund, Ali Hossein Gharari Foomani, Weijie Sun et al.

Most existing time-to-event methods focus on either single-event or competing-risks settings, leaving multi-event scenarios relatively underexplored. In many healthcare applications, for example, a patient may experience multiple clinical events, that can be non-exclusive and semi-competing. A common workaround is to train independent single-event models for such multi-event problems, but this approach fails to exploit dependencies and shared structures across events. To overcome these limitations, we propose MENSA (Multi-Event Network for Survival Analysis), a deep learning model that jointly learns flexible time-to-event distributions for multiple events, whether competing or co-occurring. In addition, we introduce a novel trajectory-based likelihood term that captures the temporal ordering between events. Across four multi-event datasets, MENSA improves predictive performance over many state-of-the-art baselines. Source code is available at https://github.com/thecml/mensa.

LGMay 15Code
SurvivalPFN: Amortizing Survival Prediction via In-Context Bayesian Inference

Shi-ang Qi, Vahid Balazadeh, Michael Cooper et al.

Survival analysis provides a powerful statistical framework for modeling time-to-event outcomes in the presence of censoring. However, selecting an appropriate estimator from the many specialized survival approaches often requires substantial methodological and domain expertise. We introduce SurvivalPFN, a prior-data fitted network that amortizes Bayesian inference for censored observations through in-context learning. SurvivalPFN is pretrained on a diverse family of synthetic, identifiable, and right-censored data-generating processes, enabling it to amortize survival analysis in a single forward pass during inference. As a result, the model adapts to the effective complexity of each dataset without task-specific training or hyperparameter tuning, avoids restrictive parametric assumptions, and produces calibrated survival distributions. In a large-scale benchmark spanning 61 datasets, 21 methods, and 5 evaluation metrics, SurvivalPFN achieves strong predictive performance and often improves upon established survival models. These results suggest that SurvivalPFN offers a principled and practical foundation model for survival analysis, with potential applications in high-impact domains such as healthcare, finance, and engineering (https://github.com/rgklab/SurvivalPFN).

LGFeb 9, 2023
Modeling and Forecasting COVID-19 Cases using Latent Subpopulations

Roberto Vega, Zehra Shah, Pouria Ramazi et al.

Classical epidemiological models assume homogeneous populations. There have been important extensions to model heterogeneous populations, when the identity of the sub-populations is known, such as age group or geographical location. Here, we propose two new methods to model the number of people infected with COVID-19 over time, each as a linear combination of latent sub-populations -- i.e., when we do not know which person is in which sub-population, and the only available observations are the aggregates across all sub-populations. Method #1 is a dictionary-based approach, which begins with a large number of pre-defined sub-population models (each with its own starting time, shape, etc), then determines the (positive) weight of small (learned) number of sub-populations. Method #2 is a mixture-of-$M$ fittable curves, where $M$, the number of sub-populations to use, is given by the user. Both methods are compatible with any parametric model; here we demonstrate their use with first (a)~Gaussian curves and then (b)~SIR trajectories. We empirically show the performance of the proposed methods, first in (i) modeling the observed data and then in (ii) forecasting the number of infected people 1 to 4 weeks in advance. Across 187 countries, we show that the dictionary approach had the lowest mean absolute percentage error and also the lowest variance when compared with classical SIR models and moreover, it was a strong baseline that outperforms many of the models developed for COVID-19 forecasting.

QMOct 30, 2024Code
MassSpecGym: A benchmark for the discovery and identification of molecules

Roman Bushuiev, Anton Bushuiev, Niek F. de Jonge et al.

The discovery and identification of molecules in biological and environmental samples is crucial for advancing biomedical and chemical sciences. Tandem mass spectrometry (MS/MS) is the leading technique for high-throughput elucidation of molecular structures. However, decoding a molecular structure from its mass spectrum is exceptionally challenging, even when performed by human experts. As a result, the vast majority of acquired MS/MS spectra remain uninterpreted, thereby limiting our understanding of the underlying (bio)chemical processes. Despite decades of progress in machine learning applications for predicting molecular structures from MS/MS spectra, the development of new methods is severely hindered by the lack of standard datasets and evaluation protocols. To address this problem, we propose MassSpecGym -- the first comprehensive benchmark for the discovery and identification of molecules from MS/MS data. Our benchmark comprises the largest publicly available collection of high-quality labeled MS/MS spectra and defines three MS/MS annotation challenges: de novo molecular structure generation, molecule retrieval, and spectrum simulation. It includes new evaluation metrics and a generalization-demanding data split, therefore standardizing the MS/MS annotation tasks and rendering the problem accessible to the broad machine learning community. MassSpecGym is publicly available at https://github.com/pluskal-lab/MassSpecGym.

SPApr 26, 2024Code
Federated Learning and Differential Privacy Techniques on Multi-hospital Population-scale Electrocardiogram Data

Vikhyat Agrawal, Sunil Vasu Kalmady, Venkataseetharam Manoj Malipeddi et al.

This research paper explores ways to apply Federated Learning (FL) and Differential Privacy (DP) techniques to population-scale Electrocardiogram (ECG) data. The study learns a multi-label ECG classification model using FL and DP based on 1,565,849 ECG tracings from 7 hospitals in Alberta, Canada. The FL approach allowed collaborative model training without sharing raw data between hospitals while building robust ECG classification models for diagnosing various cardiac conditions. These accurate ECG classification models can facilitate the diagnoses while preserving patient confidentiality using FL and DP techniques. Our results show that the performance achieved using our implementation of the FL approach is comparable to that of the pooled approach, where the model is trained over the aggregating data from all hospitals. Furthermore, our findings suggest that hospitals with limited ECGs for training can benefit from adopting the FL model compared to single-site training. In addition, this study showcases the trade-off between model performance and data privacy by employing DP during model training. Our code is available at https://github.com/vikhyatt/Hospital-FL-DP.

SPMar 17, 2025Code
Survival Analysis with Machine Learning for Predicting Li-ion Battery Remaining Useful Life

Jingyuan Xue, Longfei Wei, Dongjing Jiang et al.

Battery degradation significantly impacts the reliability and efficiency of energy storage systems, particularly in electric vehicles and industrial applications. Predicting the remaining useful life (RUL) of lithium-ion batteries is crucial for optimizing maintenance schedules, reducing costs, and improving safety. Traditional RUL prediction methods often struggle with nonlinear degradation patterns and uncertainty quantification. To address these challenges, we propose a hybrid survival analysis framework integrating survival data reconstruction, survival model learning, and survival probability estimation. Our approach transforms battery voltage time series into time-to-failure data using path signatures. The multiple Cox-based survival models and machine-learning-based methods, such as DeepHit and MTLR, are learned to predict battery failure-free probabilities over time. Experiments conducted on the Toyota battery and NASA battery datasets demonstrate the effectiveness of our approach, achieving high time-dependent AUC and concordance index (C-Index) while maintaining a low integrated Brier score. The data and source codes for this work are available to the public at https://github.com/thinkxca/rul.

LGJun 25, 2019Code
Simultaneous Prediction Intervals for Patient-Specific Survival Curves

Samuel Sokota, Ryan D'Orazio, Khurram Javed et al.

Accurate models of patient survival probabilities provide important information to clinicians prescribing care for life-threatening and terminal ailments. A recently developed class of models - known as individual survival distributions (ISDs) - produces patient-specific survival functions that offer greater descriptive power of patient outcomes than was previously possible. Unfortunately, at the time of writing, ISD models almost universally lack uncertainty quantification. In this paper, we demonstrate that an existing method for estimating simultaneous prediction intervals from samples can easily be adapted for patient-specific survival curve analysis and yields accurate results. Furthermore, we introduce both a modification to the existing method and a novel method for estimating simultaneous prediction intervals and show that they offer competitive performance. It is worth emphasizing that these methods are not limited to survival analysis and can be applied in any context in which sampling the distribution of interest is tractable. Code is available at https://github.com/ssokota/spie .

LGMay 12, 2024
Conformalized Survival Distributions: A Generic Post-Process to Increase Calibration

Shi-ang Qi, Yakun Yu, Russell Greiner

Discrimination and calibration represent two important properties of survival analysis, with the former assessing the model's ability to accurately rank subjects and the latter evaluating the alignment of predicted outcomes with actual events. With their distinct nature, it is hard for survival models to simultaneously optimize both of them especially as many previous results found improving calibration tends to diminish discrimination performance. This paper introduces a novel approach utilizing conformal regression that can improve a model's calibration without degrading discrimination. We provide theoretical guarantees for the above claim, and rigorously validate the efficiency of our approach across 11 real-world datasets, showcasing its practical applicability and robustness in diverse scenarios.

LGApr 2, 2024
FraGNNet: A Deep Probabilistic Model for Tandem Mass Spectrum Prediction

Adamo Young, Fei Wang, David S Wishart et al.

Compound identification from tandem mass spectrometry (MS/MS) data is a critical step in the analysis of complex mixtures. Typical solutions for the MS/MS spectrum to compound (MS2C) problem involve comparing the unknown spectrum against a library of known spectrum-molecule pairs, an approach that is limited by incomplete library coverage. Compound to MS/MS spectrum (C2MS) models can improve retrieval rates by augmenting real libraries with predicted MS/MS spectra. Unfortunately, many existing C2MS models suffer from problems with mass accuracy, generalization, or interpretability. We develop a new probabilistic method for C2MS prediction, FraGNNet, that can efficiently and accurately simulate MS/MS spectra with high mass accuracy. Our approach formulates the C2MS problem as learning a distribution over molecule fragments. FraGNNet achieves state-of-the-art performance in terms of prediction error and surpasses existing C2MS models as a tool for retrieval-based MS2C.

LGMar 24, 2024
An early warning indicator trained on stochastic disease-spreading models with different noises

Amit K. Chakraborty, Shan Gao, Reza Miry et al.

The timely detection of disease outbreaks through reliable early warning signals (EWSs) is indispensable for effective public health mitigation strategies. Nevertheless, the intricate dynamics of real-world disease spread, often influenced by diverse sources of noise and limited data in the early stages of outbreaks, pose a significant challenge in developing reliable EWSs, as the performance of existing indicators varies with extrinsic and intrinsic noises. Here, we address the challenge of modeling disease when the measurements are corrupted by additive white noise, multiplicative environmental noise, and demographic noise into a standard epidemic mathematical model. To navigate the complexities introduced by these noise sources, we employ a deep learning algorithm that provides EWS in infectious disease outbreak by training on noise-induced disease-spreading models. The indicator's effectiveness is demonstrated through its application to real-world COVID-19 cases in Edmonton and simulated time series derived from diverse disease spread models affected by noise. Notably, the indicator captures an impending transition in a time series of disease outbreaks and outperforms existing indicators. This study contributes to advancing early warning capabilities by addressing the intricate dynamics inherent in real-world disease spread, presenting a promising avenue for enhancing public health preparedness and response efforts.

MLFeb 26, 2025
Overcoming Dependent Censoring in the Evaluation of Survival Models

Christian Marius Lillelund, Shi-ang Qi, Russell Greiner

Conventional survival metrics, such as Harrell's concordance index (CI) and the Brier Score, rely on the independent censoring assumption for valid inference with right-censored data. However, in the presence of so-called dependent censoring, where the probability of censoring is related to the event of interest, these metrics can give biased estimates of the underlying model error. In this paper, we introduce three new evaluation metrics for survival analysis based on Archimedean copulas that can account for dependent censoring. We also develop a framework to generate realistic, semi-synthetic datasets with dependent censoring to facilitate the evaluation of the metrics. Our experiments in synthetic and semi-synthetic data demonstrate that the proposed metrics can provide more accurate estimates of the model error than conventional metrics under dependent censoring.

LGJan 14, 2025
Deep Learning for Disease Outbreak Prediction: A Robust Early Warning Signal for Transcritical Bifurcations

Reza Miry, Amit K. Chakraborty, Russell Greiner et al.

Early Warning Signals (EWSs) are vital for implementing preventive measures before a disease turns into a pandemic. While new diseases exhibit unique behaviors, they often share fundamental characteristics from a dynamical systems perspective. Moreover, measurements during disease outbreaks are often corrupted by different noise sources, posing challenges for Time Series Classification (TSC) tasks. In this study, we address the problem of having a robust EWS for disease outbreak prediction using a best-performing deep learning model in the domain of TSC. We employed two simulated datasets to train the model: one representing generated dynamical systems with randomly selected polynomial terms to model new disease behaviors, and another simulating noise-induced disease dynamics to account for noisy measurements. The model's performance was analyzed using both simulated data from different disease models and real-world data, including influenza and COVID-19. Results demonstrate that the proposed model outperforms previous models, effectively providing EWSs of impending outbreaks across various scenarios. This study bridges advancements in deep learning with the ability to provide robust early warning signals in noisy environments, making it highly applicable to real-world crises involving emerging disease outbreaks.

LGOct 27, 2024
Toward Conditional Distribution Calibration in Survival Prediction

Shi-ang Qi, Yakun Yu, Russell Greiner

Survival prediction often involves estimating the time-to-event distribution from censored datasets. Previous approaches have focused on enhancing discrimination and marginal calibration. In this paper, we highlight the significance of conditional calibration for real-world applications -- especially its role in individual decision-making. We propose a method based on conformal prediction that uses the model's predicted individual survival probability at that instance's observed time. This method effectively improves the model's marginal and conditional calibration, without compromising discrimination. We provide asymptotic theoretical guarantees for both marginal and conditional calibration and test it extensively across 15 diverse real-world datasets, demonstrating the method's practical effectiveness and versatility in various settings.

LGApr 13, 2024
Early detection of disease outbreaks and non-outbreaks using incidence data

Shan Gao, Amit K. Chakraborty, Russell Greiner et al.

Forecasting the occurrence and absence of novel disease outbreaks is essential for disease management. Here, we develop a general model, with no real-world training data, that accurately forecasts outbreaks and non-outbreaks. We propose a novel framework, using a feature-based time series classification method to forecast outbreaks and non-outbreaks. We tested our methods on synthetic data from a Susceptible-Infected-Recovered model for slowly changing, noisy disease dynamics. Outbreak sequences give a transcritical bifurcation within a specified future time window, whereas non-outbreak (null bifurcation) sequences do not. We identified incipient differences in time series of infectives leading to future outbreaks and non-outbreaks. These differences are reflected in 22 statistical features and 5 early warning signal indicators. Classifier performance, given by the area under the receiver-operating curve, ranged from 0.99 for large expanding windows of training data to 0.7 for small rolling windows. Real-world performances of classifiers were tested on two empirical datasets, COVID-19 data from Singapore and SARS data from Hong Kong, with two classifiers exhibiting high accuracy. In summary, we showed that there are statistical features that distinguish outbreak and non-outbreak sequences long before outbreaks occur. We could detect these differences in synthetic and real-world data sets, well before potential outbreaks occur.

LGOct 14, 2025
Budget-constrained Active Learning to Effectively De-censor Survival Data

Ali Parsaee, Bei Jiang, Zachary Friggstad et al.

Standard supervised learners attempt to learn a model from a labeled dataset. Given a small set of labeled instances, and a pool of unlabeled instances, a budgeted learner can use its given budget to pay to acquire the labels of some unlabeled instances, which it can then use to produce a model. Here, we explore budgeted learning in the context of survival datasets, which include (right) censored instances, where we know only a lower bound on an instance's time-to-event. Here, that learner can pay to (partially) label a censored instance -- e.g., to acquire the actual time for an instance [perhaps go from (3 yr, censored) to (7.2 yr, uncensored)], or other variants [e.g., learn about one more year, so go from (3 yr, censored) to either (4 yr, censored) or perhaps (3.2 yr, uncensored)]. This serves as a model of real world data collection, where follow-up with censored patients does not always lead to uncensoring, and how much information is given to the learner model during data collection is a function of the budget and the nature of the data itself. We provide both experimental and theoretical results for how to apply state-of-the-art budgeted learning algorithms to survival data and the respective limitations that exist in doing so. Our approach provides bounds and time complexity asymptotically equivalent to the standard active learning method BatchBALD. Moreover, empirical analysis on several survival tasks show that our model performs better than other potential approaches on several benchmarks.

QMJun 2, 2025
A meaningful prediction of functional decline in amyotrophic lateral sclerosis based on multi-event survival analysis

Christian Marius Lillelund, Sanjay Kalra, Russell Greiner

Amyotrophic lateral sclerosis (ALS) is a degenerative disorder of the motor neurons that causes progressive paralysis in patients. Current treatment options aim to prolong survival and improve quality of life. However, due to the heterogeneity of the disease, it is often difficult to determine the optimal time for potential therapies or medical interventions. In this study, we propose a novel method to predict the time until a patient with ALS experiences significant functional impairment (ALSFRS-R <= 2) for each of five common functions: speaking, swallowing, handwriting, walking, and breathing. We formulate this task as a multi-event survival problem and validate our approach in the PRO-ACT dataset (N = 3220) by training five covariate-based survival models to estimate the probability of each event over the 500 days following the baseline visit. We then predict five event-specific individual survival distributions (ISDs) for a patient, each providing an interpretable estimate of when that event is likely to occur. The results show that covariate-based models are superior to the Kaplan-Meier estimator at predicting time-to-event outcomes in the PRO-ACT dataset. Additionally, our method enables practitioners to make individual counterfactual predictions -- where certain covariates can be changed -- to estimate their effect on the predicted outcome. In this regard, we find that Riluzole has little or no impact on predicted functional decline. However, for patients with bulbar-onset ALS, our model predicts significantly shorter time-to-event estimates for loss of speech and swallowing function compared to patients with limb-onset ALS (log-rank p<0.001, Bonferroni-adjusted alpha=0.01). The proposed method can be applied to current clinical examination data to assess the risk of functional decline and thus allow more personalized treatment planning.

MEJun 2, 2025
Stop Chasing the C-index: This Is How We Should Evaluate Our Survival Models

Christian Marius Lillelund, Shi-ang Qi, Russell Greiner et al.

We argue that many survival analysis and time-to-event models are incorrectly evaluated. First, we survey many examples of evaluation approaches in the literature and find that most rely on concordance (C-index). However, the C-index only measures a model's discriminative ability and does not assess other important aspects, such as the accuracy of the time-to-event predictions or the calibration of the model's probabilistic estimates. Next, we present a set of key desiderata for choosing the right evaluation metric and discuss their pros and cons. These are tailored to the challenges in survival analysis, such as sensitivity to miscalibration and various censoring assumptions. We hypothesize that the current development of survival metrics conforms to a double-helix ladder, and that model validity and metric validity must stand on the same rung of the assumption ladder. Finally, we discuss the appropriate methods for evaluating a survival model in practice and summarize various viewpoints opposing our analysis.

LGMar 9, 2025
Censoring-Aware Tree-Based Reinforcement Learning for Estimating Dynamic Treatment Regimes with Censored Outcomes

Animesh Kumar Paul, Russell Greiner

Dynamic Treatment Regimes (DTRs) provide a systematic approach for making sequential treatment decisions that adapt to individual patient characteristics, particularly in clinical contexts where survival outcomes are of interest. Censoring-Aware Tree-Based Reinforcement Learning (CA-TRL) is a novel framework to address the complexities associated with censored data when estimating optimal DTRs. We explore ways to learn effective DTRs, from observational data. By enhancing traditional tree-based reinforcement learning methods with augmented inverse probability weighting (AIPW) and censoring-aware modifications, CA-TRL delivers robust and interpretable treatment strategies. We demonstrate its effectiveness through extensive simulations and real-world applications using the SANAD epilepsy dataset, where it outperformed the recently proposed ASCL method in key metrics such as restricted mean survival time (RMST) and decision-making accuracy. This work represents a step forward in advancing personalized and data-driven treatment strategies across diverse healthcare settings.

IVMay 29, 2023
The ACROBAT 2022 Challenge: Automatic Registration Of Breast Cancer Tissue

Philippe Weitz, Masi Valkonen, Leslie Solorzano et al.

The alignment of tissue between histopathological whole-slide-images (WSI) is crucial for research and clinical applications. Advances in computing, deep learning, and availability of large WSI datasets have revolutionised WSI analysis. Therefore, the current state-of-the-art in WSI registration is unclear. To address this, we conducted the ACROBAT challenge, based on the largest WSI registration dataset to date, including 4,212 WSIs from 1,152 breast cancer patients. The challenge objective was to align WSIs of tissue that was stained with routine diagnostic immunohistochemistry to its H&E-stained counterpart. We compare the performance of eight WSI registration algorithms, including an investigation of the impact of different WSI properties and clinical covariates. We find that conceptually distinct WSI registration methods can lead to highly accurate registration performances and identify covariates that impact performances across methods. These results establish the current state-of-the-art in WSI registration and guide researchers in selecting and developing methods.

LGJan 14, 2022
Domain-shift adaptation via linear transformations

Roberto Vega, Russell Greiner

A predictor, $f_A : X \to Y$, learned with data from a source domain (A) might not be accurate on a target domain (B) when their distributions are different. Domain adaptation aims to reduce the negative effects of this distribution mismatch. Here, we analyze the case where $P_A(Y\ |\ X) \neq P_B(Y\ |\ X)$, $P_A(X) \neq P_B(X)$ but $P_A(Y) = P_B(Y)$; where there are affine transformations of $X$ that makes all distributions equivalent. We propose an approach to project the source and target domains into a lower-dimensional, common space, by (1) projecting the domains into the eigenvectors of the empirical covariance matrices of each domain, then (2) finding an orthogonal matrix that minimizes the maximum mean discrepancy between the projections of both domains. For arbitrary affine transformations, there is an inherent unidentifiability problem when performing unsupervised domain adaptation that can be alleviated in the semi-supervised case. We show the effectiveness of our approach in simulated data and in binary digit classification tasks, obtaining improvements up to 48% accuracy when correcting for the domain shift in the data.

LGNov 11, 2021
Variational Auto-Encoder Architectures that Excel at Causal Inference

Negar Hassanpour, Russell Greiner

Estimating causal effects from observational data (at either an individual -- or a population -- level) is critical for making many types of decisions. One approach to address this task is to learn decomposed representations of the underlying factors of data; this becomes significantly more challenging when there are confounding factors (which influence both the cause and the effect). In this paper, we take a generative approach that builds on the recent advances in Variational Auto-Encoders to simultaneously learn those underlying factors as well as the causal effects. We propose a progressive sequence of models, where each improves over the previous one, culminating in the Hybrid model. Our empirical results demonstrate that the performance of all three proposed models are superior to both state-of-the-art discriminative as well as other generative approaches in the literature.

LGJun 3, 2021
SIMLR: Machine Learning inside the SIR model for COVID-19 Forecasting

Roberto Vega, Leonardo Flores, Russell Greiner

Accurate forecasts of the number of newly infected people during an epidemic are critical for making effective timely decisions. This paper addresses this challenge using the SIMLR model, which incorporates machine learning (ML) into the epidemiological SIR model. For each region, SIMLR tracks the changes in the policies implemented at the government level, which it uses to estimate the time-varying parameters of an SIR model for forecasting the number of new infections 1- to 4-weeks in advance.It also forecasts the probability of changes in those government policies at each of these future times, which is essential for the longer-range forecasts. We applied SIMLR to data from regions in Canada and in the United States,and show that its MAPE (mean average percentage error) performance is as good as SOTA forecasting models, with the added advantage of being an interpretable model. We expect that this approach will be useful not only for forecasting COVID-19 infections, but also in predicting the evolution of other infectious diseases.

CVFeb 11, 2021
Sample Efficient Learning of Image-Based Diagnostic Classifiers Using Probabilistic Labels

Roberto Vega, Pouneh Gorji, Zichen Zhang et al.

Deep learning approaches often require huge datasets to achieve good generalization. This complicates its use in tasks like image-based medical diagnosis, where the small training datasets are usually insufficient to learn appropriate data representations. For such sensitive tasks it is also important to provide the confidence in the predictions. Here, we propose a way to learn and use probabilistic labels to train accurate and calibrated deep networks from relatively small datasets. We observe gains of up to 22% in the accuracy of models trained with these labels, as compared with traditional approaches, in three classification tasks: diagnosis of hip dysplasia, fatty liver, and glaucoma. The outputs of models trained with probabilistic labels are calibrated, allowing the interpretation of its predictions as proper probabilities. We anticipate this approach will apply to other tasks where few training instances are available and expert knowledge can be encoded as probabilities.

LGOct 24, 2020
Shared Space Transfer Learning for analyzing multi-site fMRI data

Muhammad Yousefnezhad, Alessandro Selvitella, Daoqiang Zhang et al.

Multi-voxel pattern analysis (MVPA) learns predictive models from task-based functional magnetic resonance imaging (fMRI) data, for distinguishing when subjects are performing different cognitive tasks -- e.g., watching movies or making decisions. MVPA works best with a well-designed feature set and an adequate sample size. However, most fMRI datasets are noisy, high-dimensional, expensive to collect, and with small sample sizes. Further, training a robust, generalized predictive model that can analyze homogeneous cognitive tasks provided by multi-site fMRI datasets has additional challenges. This paper proposes the Shared Space Transfer Learning (SSTL) as a novel transfer learning (TL) approach that can functionally align homogeneous multi-site fMRI datasets, and so improve the prediction performance in every site. SSTL first extracts a set of common features for all subjects in each site. It then uses TL to map these site-specific features to a site-independent shared space in order to improve the performance of the MVPA. SSTL uses a scalable optimization procedure that works effectively for high-dimensional fMRI datasets. The optimization procedure extracts the common features for each site by using a single-iteration algorithm and maps these site-specific common features to the site-independent shared space. We evaluate the effectiveness of the proposed method for transferring between various cognitive tasks. Our comprehensive experiments validate that SSTL achieves superior performance to other state-of-the-art analysis techniques.

LGDec 19, 2019
Reducing Selection Bias in Counterfactual Reasoning for Individual Treatment Effects Estimation

Zichen Zhang, Qingfeng Lan, Lei Ding et al.

Counterfactual reasoning is an important paradigm applicable in many fields, such as healthcare, economics, and education. In this work, we propose a novel method to address the issue of \textit{selection bias}. We learn two groups of latent random variables, where one group corresponds to variables that only cause selection bias, and the other group is relevant for outcome prediction. They are learned by an auto-encoder where an additional regularized loss based on Pearson Correlation Coefficient (PCC) encourages the de-correlation between the two groups of random variables. This allows for explicitly alleviating selection bias by only keeping the latent variables that are relevant for estimating individual treatment effects. Experimental results on a synthetic toy dataset and a benchmark dataset show that our algorithm is able to achieve state-of-the-art performance and improve the result of its counterpart that does not explicitly model the selection bias.

LGSep 11, 2019
Domain Aggregation Networks for Multi-Source Domain Adaptation

Junfeng Wen, Russell Greiner, Dale Schuurmans

In many real-world applications, we want to exploit multiple source datasets of similar tasks to learn a model for a different but related target dataset -- e.g., recognizing characters of a new font using a set of different fonts. While most recent research has considered ad-hoc combination rules to address this problem, we extend previous work on domain discrepancy minimization to develop a finite-sample generalization bound, and accordingly propose a theoretically justified optimization procedure. The algorithm we develop, Domain AggRegation Network (DARN), is able to effectively adjust the weight of each source domain during training to ensure relevant domains are given more importance for adaptation. We evaluate the proposed method on real-world sentiment analysis and digit recognition datasets and show that DARN can significantly outperform the state-of-the-art alternatives.

LGMar 29, 2019
The Challenge of Predicting Meal-to-meal Blood Glucose Concentrations for Patients with Type I Diabetes

Neil C. Borle, Edmond A. Ryan, Russell Greiner

Patients with Type I Diabetes (T1D) must take insulin injections to prevent the serious long term effects of hyperglycemia - high blood glucose (BG). Patients must also be careful not to inject too much insulin because this could induce hypoglycemia (low BG), which can potentially be fatal. Patients therefore follow a "regimen" that determines how much insulin to inject at certain times. Current methods for managing this disease require adjusting the patient's regimen over time based on the disease's behavior (recorded in the patient's diabetes diary). If we can accurately predict a patient's future BG values from his/her current features (e.g., predicting today's lunch BG value given today's diabetes diary entry for breakfast, including insulin injections, and perhaps earlier entries), then it is relatively easy to produce an effective regimen. This study explores the challenges of BG modeling by applying several machine learning algorithms and various data preprocessing variations (corresponding to 312 [learner, preprocessed-dataset] combinations), to a new T1D dataset containing 29 601 entries from 47 different patients. Our most accurate predictor is a weighted ensemble of two Gaussian Process Regression models, which achieved an errL1 loss of 2.70 mmol/L (48.65 mg/dl). This was an unexpectedly poor result given that one can obtain an errL1 of 2.91 mmol/L (52.43 mg/dl) using the naive approach of simply predicting the patient's average BG. For each of data-variant/model combination we report several evaluation metrics, including glucose-specific metrics, and find similarly disappointing results (the best model was only incrementally better than the simplest measure). These results suggest that the diabetes diary data that is typically collected may not be sufficient to produce accurate BG prediction models; additional data may be necessary to build accurate BG prediction models.

LGMar 25, 2019
Gene Expression based Survival Prediction for Cancer Patients: A Topic Modeling Approach

Luke Kumar, Russell Greiner

Cancer is one of the leading cause of death, worldwide. Many believe that genomic data will enable us to better predict the survival time of these patients, which will lead to better, more personalized treatment options and patient care. As standard survival prediction models have a hard time coping with the high-dimensionality of such gene expression (GE) data, many projects use some dimensionality reduction techniques to overcome this hurdle. We introduce a novel methodology, inspired by topic modeling from the natural language domain, to derive expressive features from the high-dimensional GE data. There, a document is represented as a mixture over a relatively small number of topics, where each topic corresponds to a distribution over the words; here, to accommodate the heterogeneity of a patient's cancer, we represent each patient (~document) as a mixture over cancer-topics, where each cancer-topic is a mixture over GE values (~words). This required some extensions to the standard LDA model eg: to accommodate the "real-valued" expression values - leading to our novel "discretized" Latent Dirichlet Allocation (dLDA) procedure. We initially focus on the METABRIC dataset, which describes breast cancer patients using the r=49,576 GE values, from microarrays. Our results show that our approach provides survival estimates that are more accurate than standard models, in terms of the standard Concordance measure. We then validate this approach by running it on the Pan-kidney (KIPAN) dataset, over r=15,529 GE values - here using the mRNAseq modality - and find that it again achieves excellent results. In both cases, we also show that the resulting model is calibrated, using the recent "D-calibrated" measure. These successes, in two different cancer types and expression modalities, demonstrates the generality, and the effectiveness, of this approach.

LGNov 28, 2018
Effective Ways to Build and Evaluate Individual Survival Distributions

Humza Haider, Bret Hoehn, Sarah Davis et al.

An accurate model of a patient's individual survival distribution can help determine the appropriate treatment for terminal patients. Unfortunately, risk scores (e.g., from Cox Proportional Hazard models) do not provide survival probabilities, single-time probability models (e.g., the Gail model, predicting 5 year probability) only provide for a single time point, and standard Kaplan-Meier survival curves provide only population averages for a large class of patients meaning they are not specific to individual patients. This motivates an alternative class of tools that can learn a model which provides an individual survival distribution which gives survival probabilities across all times - such as extensions to the Cox model, Accelerated Failure Time, an extension to Random Survival Forests, and Multi-Task Logistic Regression. This paper first motivates such "individual survival distribution" (ISD) models, and explains how they differ from standard models. It then discusses ways to evaluate such models - namely Concordance, 1-Calibration, Brier score, and various versions of L1-loss - and then motivates and defines a novel approach "D-Calibration", which determines whether a model's probability estimates are meaningful. We also discuss how these measures differ, and use them to evaluate several ISD prediction tools, over a range of survival datasets.

CVDec 1, 2017
Learning Neural Markers of Schizophrenia Disorder Using Recurrent Neural Networks

Jumana Dakka, Pouya Bashivan, Mina Gheiratmand et al.

Smart systems that can accurately diagnose patients with mental disorders and identify effective treatments based on brain functional imaging data are of great applicability and are gaining much attention. Most previous machine learning studies use hand-designed features, such as functional connectivity, which does not maintain the potential useful information in the spatial relationship between brain regions and the temporal profile of the signal in each region. Here we propose a new method based on recurrent-convolutional neural networks to automatically learn useful representations from segments of 4-D fMRI recordings. Our goal is to exploit both spatial and temporal information in the functional MRI movie (at the whole-brain voxel level) for identifying patients with schizophrenia.

MLJan 1, 2016
Stochastic Neural Networks with Monotonic Activation Functions

Siamak Ravanbakhsh, Barnabas Poczos, Jeff Schneider et al.

We propose a Laplace approximation that creates a stochastic unit from any smooth monotonic activation function, using only Gaussian noise. This paper investigates the application of this stochastic approximation in training a family of Restricted Boltzmann Machines (RBM) that are closely linked to Bregman divergences. This family, that we call exponential family RBM (Exp-RBM), is a subset of the exponential family Harmoniums that expresses family members through a choice of smooth monotonic non-linearity for each neuron. Using contrastive divergence along with our Gaussian approximation, we show that Exp-RBM can learn useful representations using novel stochastic units.

STSep 28, 2015
Boolean Matrix Factorization and Noisy Completion via Message Passing

Siamak Ravanbakhsh, Barnabas Poczos, Russell Greiner

Boolean matrix factorization and Boolean matrix completion from noisy observations are desirable unsupervised data-analysis methods due to their interpretability, but hard to perform due to their NP-hardness. We treat these problems as maximum a posteriori inference problems in a graphical model and present a message passing approach that scales linearly with the number of observations and factors. Our empirical study demonstrates that message passing is able to recover low-rank Boolean matrices, in the boundaries of theoretically possible recovery and compares favorably with state-of-the-art in real-world applications, such collaborative filtering with large-scale Boolean data.

AISep 25, 2014
Revisiting Algebra and Complexity of Inference in Graphical Models

Siamak Ravanbakhsh, Russell Greiner

This paper studies the form and complexity of inference in graphical models using the abstraction offered by algebraic structures. In particular, we broadly formalize inference problems in graphical models by viewing them as a sequence of operations based on commutative semigroups. We then study the computational complexity of inference by organizing various problems into an "inference hierarchy". When the underlying structure of an inference problem is a commutative semiring -- i.e. a combination of two commutative semigroups with the distributive law -- a message passing procedure called belief propagation can leverage this distributive law to perform polynomial-time inference for certain problems. After establishing the NP-hardness of inference in any commutative semiring, we investigate the relation between algebraic properties in this setting and further show that polynomial-time inference using distributive law does not (trivially) extend to inference problems that are expressed using more than two commutative semigroups. We then extend the algebraic treatment of message passing procedures to survey propagation, providing a novel perspective using a combination of two commutative semirings. This formulation generalizes the application of survey propagation to new settings.

AISep 4, 2014
Accurate, fully-automated NMR spectral profiling for metabolomics

Siamak Ravanbakhsh, Philip Liu, Trent Bjorndahl et al.

Many diseases cause significant changes to the concentrations of small molecules (aka metabolites) that appear in a person's biofluids, which means such diseases can often be readily detected from a person's "metabolic profile". This information can be extracted from a biofluid's NMR spectrum. Today, this is often done manually by trained human experts, which means this process is relatively slow, expensive and error-prone. This paper presents a tool, Bayesil, that can quickly, accurately and autonomously produce a complex biofluid's (e.g., serum or CSF) metabolic profile from a 1D1H NMR spectrum. This requires first performing several spectral processing steps then matching the resulting spectrum against a reference compound library, which contains the "signatures" of each relevant metabolite. Many of these steps are novel algorithms and our matching step views spectral matching as an inference problem within a probabilistic graphical model that rapidly approximates the most probable metabolic profile. Our extensive studies on a diverse set of complex mixtures, show that Bayesil can autonomously find the concentration of all NMR-detectable metabolites accurately (~90% correct identification and ~10% quantification error), in <5minutes on a single CPU. These results demonstrate that Bayesil is the first fully-automatic publicly-accessible system that provides quantitative NMR spectral profiling effectively -- with an accuracy that meets or exceeds the performance of trained experts. We anticipate this tool will usher in high-throughput metabolomics and enable a wealth of new applications of NMR in clinical settings. Available at http://www.bayesil.ca.

AIJun 4, 2014
Augmentative Message Passing for Traveling Salesman Problem and Graph Partitioning

Siamak Ravanbakhsh, Reihaneh Rabbany, Russell Greiner

The cutting plane method is an augmentative constrained optimization procedure that is often used with continuous-domain optimization techniques such as linear and convex programs. We investigate the viability of a similar idea within message passing -- which produces integral solutions -- in the context of two combinatorial problems: 1) For Traveling Salesman Problem (TSP), we propose a factor-graph based on Held-Karp formulation, with an exponential number of constraint factors, each of which has an exponential but sparse tabular form. 2) For graph-partitioning (a.k.a., community mining) using modularity optimization, we introduce a binary variable model with a large number of constraints that enforce formation of cliques. In both cases we are able to derive surprisingly simple message updates that lead to competitive solutions on benchmark instances. In particular for TSP we are able to find near-optimal solutions in the time that empirically grows with N^3, demonstrating that augmentation is practical and efficient.

NEMay 6, 2014
Training Restricted Boltzmann Machine by Perturbation

Siamak Ravanbakhsh, Russell Greiner, Brendan Frey

A new approach to maximum likelihood learning of discrete graphical models and RBM in particular is introduced. Our method, Perturb and Descend (PD) is inspired by two ideas (I) perturb and MAP method for sampling (II) learning by Contrastive Divergence minimization. In contrast to perturb and MAP, PD leverages training data to learn the models that do not allow efficient MAP estimation. During the learning, to produce a sample from the current model, we start from a training data and descend in the energy landscape of the "perturbed model", for a fixed number of steps, or until a local optima is reached. For RBM, this involves linear calculations and thresholding which can be very fast. Furthermore we show that the amount of perturbation is closely related to the temperature parameter and it can regularize the model by producing robust features resulting in sparse hidden layer activation.

AIJan 26, 2014
Perturbed Message Passing for Constraint Satisfaction Problems

Siamak Ravanbakhsh, Russell Greiner

We introduce an efficient message passing scheme for solving Constraint Satisfaction Problems (CSPs), which uses stochastic perturbation of Belief Propagation (BP) and Survey Propagation (SP) messages to bypass decimation and directly produce a single satisfying assignment. Our first CSP solver, called Perturbed Blief Propagation, smoothly interpolates two well-known inference procedures; it starts as BP and ends as a Gibbs sampler, which produces a single sample from the set of solutions. Moreover we apply a similar perturbation scheme to SP to produce another CSP solver, Perturbed Survey Propagation. Experimental results on random and real-world CSPs show that Perturbed BP is often more successful and at the same time tens to hundreds of times more efficient than standard BP guided decimation. Perturbed BP also compares favorably with state-of-the-art SP-guided decimation, which has a computational complexity that generally scales exponentially worse than our method (wrt the cardinality of variable domains and constraints). Furthermore, our experiments with random satisfiability and coloring problems demonstrate that Perturbed SP can outperform SP-guided decimation, making it the best incomplete random CSP-solver in difficult regimes.

AIFeb 6, 2013
Learning Bayesian Nets that Perform Well

Russell Greiner, Adam J. Grove, Dale Schuurmans

A Bayesian net (BN) is more than a succinct way to encode a probabilistic distribution; it also corresponds to a function used to answer queries. A BN can therefore be evaluated by the accuracy of the answers it returns. Many algorithms for learning BNs, however, attempt to optimize another criterion (usually likelihood, possibly augmented with a regularizing term), which is independent of the distribution of queries that are posed. This paper takes the "performance criteria" seriously, and considers the challenge of computing the BN whose performance - read "accuracy over the distribution of queries" - is optimal. We show that many aspects of this learning task are more difficult than the corresponding subtasks in the standard model.

LGJan 23, 2013
Comparing Bayesian Network Classifiers

Jie Cheng, Russell Greiner

In this paper, we empirically evaluate algorithms for learning four types of Bayesian network (BN) classifiers - Naive-Bayes, tree augmented Naive-Bayes, BN augmented Naive-Bayes and general BNs, where the latter two are learned using two variants of a conditional-independence (CI) based BN-learning algorithm. Experimental results show the obtained classifiers, learned using the CI based algorithms, are competitive with (or superior to) the best known classifiers, based on both Bayesian networks and other formalisms; and that the computational time for learning and using these classifiers is relatively small. Moreover, these results also suggest a way to learn yet more effective classifiers; we demonstrate empirically that this new algorithm does work as expected. Collectively, these results argue that BN classifiers deserve more attention in machine learning and data mining communities.

AIJan 10, 2013
Bayesian Error-Bars for Belief Net Inference

Tim Van Allen, Russell Greiner, Peter Hooper

A Bayesian Belief Network (BN) is a model of a joint distribution over a setof n variables, with a DAG structure to represent the immediate dependenciesbetween the variables, and a set of parameters (aka CPTables) to represent thelocal conditional probabilities of a node, given each assignment to itsparents. In many situations, these parameters are themselves random variables - this may reflect the uncertainty of the domain expert, or may come from atraining sample used to estimate the parameter values. The distribution overthese "CPtable variables" induces a distribution over the response the BNwill return to any "What is Pr(H | E)?" query. This paper investigates thevariance of this response, showing first that it is asymptotically normal,then providing its mean and asymptotical variance. We then present aneffective general algorithm for computing this variance, which has the samecomplexity as simply computing the (mean value of) the response itself - ie,O(n 2^w), where n is the number of variables and w is the effective treewidth. Finally, we provide empirical evidence that this algorithm, whichincorporates assumptions and approximations, works effectively in practice,given only small samples.

LGOct 19, 2012
Budgeted Learning of Naive-Bayes Classifiers

Daniel J. Lizotte, Omid Madani, Russell Greiner

Frequently, acquiring training data has an associated cost. We consider the situation where the learner may purchase data during training, subject TO a budget. IN particular, we examine the CASE WHERE each feature label has an associated cost, AND the total cost OF ALL feature labels acquired during training must NOT exceed the budget.This paper compares methods FOR choosing which feature label TO purchase next, given the budget AND the CURRENT belief state OF naive Bayes model parameters.Whereas active learning has traditionally focused ON myopic(greedy) strategies FOR query selection, this paper presents a tractable method FOR incorporating knowledge OF the budget INTO the decision making process, which improves performance.

LGJul 11, 2012
Active Model Selection

Omid Madani, Daniel J. Lizotte, Russell Greiner

Classical learning assumes the learner is given a labeled data sample, from which it learns a model. The field of Active Learning deals with the situation where the learner begins not with a training sample, but instead with resources that it can use to obtain information to help identify the optimal model. To better understand this task, this paper presents and analyses the simplified "(budgeted) active model selection" version, which captures the pure exploration aspect of many active learning problems in a clean and simple problem formulation. Here the learner can use a fixed budget of "model probes" (where each probe evaluates the specified model on a random indistinguishable instance) to identify which of a given set of possible models has the highest expected accuracy. Our goal is a policy that sequentially determines which model to probe next, based on the information observed so far. We present a formal description of this task, and show that it is NPhard in general. We then investigate a number of algorithms for this task, including several existing ones (eg, "Round-Robin", "Interval Estimation", "Gittins") as well as some novel ones (e.g., "Biased-Robin"), describing first their approximation properties and then their empirical performance on various problem instances. We observe empirically that the simple biased-robin algorithm significantly outperforms the other algorithms in the case of identical costs and priors.

AIJun 18, 2012
A Generalized Loop Correction Method for Approximate Inference in Graphical Models

Siamak Ravanbakhsh, Chun-Nam Yu, Russell Greiner

Belief Propagation (BP) is one of the most popular methods for inference in probabilistic graphical models. BP is guaranteed to return the correct answer for tree structures, but can be incorrect or non-convergent for loopy graphical models. Recently, several new approximate inference algorithms based on cavity distribution have been proposed. These methods can account for the effect of loops by incorporating the dependency between BP messages. Alternatively, region-based approximations (that lead to methods such as Generalized Belief Propagation) improve upon BP by considering interactions within small clusters of variables, thus taking small loops within these clusters into account. This paper introduces an approach, Generalized Loop Correction (GLC), that benefits from both of these types of loop correction. We show how GLC relates to these two families of inference methods, then provide empirical evidence that GLC works effectively in general, and can be significantly more accurate than both correction schemes.

AIJun 13, 2012
Speeding Up Planning in Markov Decision Processes via Automatically Constructed Abstractions

Alejandro Isaza, Csaba Szepesvari, Vadim Bulitko et al.

In this paper, we consider planning in stochastic shortest path (SSP) problems, a subclass of Markov Decision Problems (MDP). We focus on medium-size problems whose state space can be fully enumerated. This problem has numerous important applications, such as navigation and planning under uncertainty. We propose a new approach for constructing a multi-level hierarchy of progressively simpler abstractions of the original problem. Once computed, the hierarchy can be used to speed up planning by first finding a policy for the most abstract level and then recursively refining it into a solution to the original problem. This approach is fully automated and delivers a speed-up of two orders of magnitude over a state-of-the-art MDP solver on sample problems while returning near-optimal solutions. We also prove theoretical bounds on the loss of solution optimality resulting from the use of abstractions.

AIMay 9, 2012
Improved Mean and Variance Approximations for Belief Net Responses via Network Doubling

Peter Hooper, Yasin Abbasi-Yadkori, Russell Greiner et al.

A Bayesian belief network models a joint distribution with an directed acyclic graph representing dependencies among variables and network parameters characterizing conditional distributions. The parameters are viewed as random variables to quantify uncertainty about their values. Belief nets are used to compute responses to queries; i.e., conditional probabilities of interest. A query is a function of the parameters, hence a random variable. Van Allen et al. (2001, 2008) showed how to quantify uncertainty about a query via a delta method approximation of its variance. We develop more accurate approximations for both query mean and variance. The key idea is to extend the query mean approximation to a "doubled network" involving two independent replicates. Our method assumes complete data and can be applied to discrete, continuous, and hybrid networks (provided discrete variables have only discrete parents). We analyze several improvements, and provide empirical studies to demonstrate their effectiveness.