Bo Christer Bertilson

2papers

2 Papers

LGJun 4, 2019
Neuropathic Pain Diagnosis Simulator for Causal Discovery Algorithm Evaluation

Ruibo Tu, Kun Zhang, Bo Christer Bertilson et al.

Discovery of causal relations from observational data is essential for many disciplines of science and real-world applications. However, unlike other machine learning algorithms, whose development has been greatly fostered by a large amount of available benchmark datasets, causal discovery algorithms are notoriously difficult to be systematically evaluated because few datasets with known ground-truth causal relations are available. In this work, we handle the problem of evaluating causal discovery algorithms by building a flexible simulator in the medical setting. We develop a neuropathic pain diagnosis simulator, inspired by the fact that the biological processes of neuropathic pathophysiology are well studied with well-understood causal influences. Our simulator exploits the causal graph of the neuropathic pain pathology and its parameters in the generator are estimated from real-life patient cases. We show that the data generated from our simulator have similar statistics as real-world data. As a clear advantage, the simulator can produce infinite samples without jeopardizing the privacy of real-world patients. Our simulator provides a natural tool for evaluating various types of causal discovery algorithms, including those to deal with practical issues in causal discovery, such as unknown confounders, selection bias, and missing data. Using our simulator, we have evaluated extensively causal discovery algorithms under various settings.

LGJul 11, 2018
Causal Discovery in the Presence of Missing Data

Ruibo Tu, Kun Zhang, Paul Ackermann et al.

Missing data are ubiquitous in many domains including healthcare. When these data entries are not missing completely at random, the (conditional) independence relations in the observed data may be different from those in the complete data generated by the underlying causal process. Consequently, simply applying existing causal discovery methods to the observed data may lead to wrong conclusions. In this paper, we aim at developing a causal discovery method to recover the underlying causal structure from observed data that follow different missingness mechanisms, including missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). With missingness mechanisms represented by missingness graphs, we analyse conditions under which additional correction is needed to derive conditional independence/dependence relations in the complete data. Based on our analysis, we propose the Missing Value PC (MVPC) algorithm for both continuous and binary variables, which extends the PC algorithm to incorporate additional corrections. Our proposed MVPC is shown in theory to give asymptotically correct results even on data that are MAR or MNAR. Experimental results on synthetic data show that the proposed algorithm is able to find correct causal relations even in the general case of MNAR. Moreover, we create a neuropathic pain diagnostic simulator for evaluating causal discovery methods. Evaluated on such simulated neuropathic pain diagnosis records and the other two real world applications, MVPC outperforms the other benchmark methods.