MEJun 14, 2019
Identify treatment effect patterns for personalised decisionsJiuyong Li, Lin Liu, Shisheng Zhang et al.
In personalised decision making, evidence is required to determine whether an action (treatment) is suitable for an individual. Such evidence can be obtained by modelling treatment effect heterogeneity in subgroups. The existing interpretable modelling methods take a top-down approach to search for subgroups with heterogeneous treatment effects and they may miss the most specific and relevant context for an individual. In this paper, we design a \emph{Treatment effect pattern (TEP)} to represent treatment effect heterogeneity in data. To achieve an interpretable presentation of TEPs, we use a local causal structure around the outcome to explicitly show how those important variables are used in modelling. We also derive a formula for unbiasedly estimating the \emph{Conditional Average Causal Effect (CATE)} using the local structure in our problem setting. In the discovery process, we aim at minimising heterogeneity within each subgroup represented by a pattern. We propose a bottom-up search algorithm to discover the most specific patterns fitting individual circumstances the best for personalised decision making. Experiments show that the proposed method models treatment effect heterogeneity better than three other existing tree based methods in synthetic and real world data sets.
AIAug 20, 2018
Discovering Context Specific Causal RelationshipsSaisai Ma, Jiuyong Li, Lin Liu et al.
With the increasing need of personalised decision making, such as personalised medicine and online recommendations, a growing attention has been paid to the discovery of the context and heterogeneity of causal relationships. Most existing methods, however, assume a known cause (e.g. a new drug) and focus on identifying from data the contexts of heterogeneous effects of the cause (e.g. patient groups with different responses to the new drug). There is no approach to efficiently detecting directly from observational data context specific causal relationships, i.e. discovering the causes and their contexts simultaneously. In this paper, by taking the advantages of highly efficient decision tree induction and the well established causal inference framework, we propose the Tree based Context Causal rule discovery (TCC) method, for efficient exploration of context specific causal relationships from data. Experiments with both synthetic and real world data sets show that TCC can effectively discover context specific causal rules from the data.
AIAug 28, 2015
Mining Combined Causes in Large Data SetsSaisai Ma, Jiuyong Li, Lin Liu et al.
In recent years, many methods have been developed for detecting causal relationships in observational data. Some of them have the potential to tackle large data sets. However, these methods fail to discover a combined cause, i.e. a multi-factor cause consisting of two or more component variables which individually are not causes. A straightforward approach to uncovering a combined cause is to include both individual and combined variables in the causal discovery using existing methods, but this scheme is computationally infeasible due to the huge number of combined variables. In this paper, we propose a novel approach to address this practical causal discovery problem, i.e. mining combined causes in large data sets. The experiments with both synthetic and real world data sets show that the proposed method can obtain high-quality causal discoveries with a high computational efficiency.
AIAug 16, 2015
From Observational Studies to Causal Rule MiningJiuyong Li, Thuc Duy Le, Lin Liu et al.
Randomised controlled trials (RCTs) are the most effective approach to causal discovery, but in many circumstances it is impossible to conduct RCTs. Therefore observational studies based on passively observed data are widely accepted as an alternative to RCTs. However, in observational studies, prior knowledge is required to generate the hypotheses about the cause-effect relationships to be tested, hence they can only be applied to problems with available domain knowledge and a handful of variables. In practice, many data sets are of high dimensionality, which leaves observational studies out of the opportunities for causal discovery from such a wealth of data sources. In another direction, many efficient data mining methods have been developed to identify associations among variables in large data sets. The problem is, causal relationships imply associations, but the reverse is not always true. However we can see the synergy between the two paradigms here. Specifically, association rule mining can be used to deal with the high-dimensionality problem while observational studies can be utilised to eliminate non-causal associations. In this paper we propose the concept of causal rules (CRs) and develop an algorithm for mining CRs in large data sets. We use the idea of retrospective cohort studies to detect CRs based on the results of association rule mining. Experiments with both synthetic and real world data sets have demonstrated the effectiveness and efficiency of CR mining. In comparison with the commonly used causal discovery methods, the proposed approach in general is faster and has better or competitive performance in finding correct or sensible causes. It is also capable of finding a cause consisting of multiple variables, a feature that other causal discovery methods do not possess.
AIAug 16, 2015
Causal Decision TreesJiuyong Li, Saisai Ma, Thuc Duy Le et al.
Uncovering causal relationships in data is a major objective of data analytics. Causal relationships are normally discovered with designed experiments, e.g. randomised controlled trials, which, however are expensive or infeasible to be conducted in many cases. Causal relationships can also be found using some well designed observational studies, but they require domain experts' knowledge and the process is normally time consuming. Hence there is a need for scalable and automated methods for causal relationship exploration in data. Classification methods are fast and they could be practical substitutes for finding causal signals in data. However, classification methods are not designed for causal discovery and a classification method may find false causal signals and miss the true ones. In this paper, we develop a causal decision tree where nodes have causal interpretations. Our method follows a well established causal inference framework and makes use of a classic statistical test. The method is practical for finding causal signals in large data sets.