A causation coefficient and taxonomy of correlation/causation relationships
This work addresses a foundational issue in statistics and data science: the confusion of correlation with causation. Its contribution appears incremental, since the proposed tool builds on existing probabilistic causal models.
The paper addresses the problem of distinguishing correlation from causation by introducing a causation coefficient, a causal analogue of the Pearson correlation coefficient. Comparing the two coefficients gives a rigorous classification of the possible correlation/causation relationships, and example calculations are shown on real data.
This paper introduces a causation coefficient, defined in terms of probabilistic causal models. The coefficient is proposed as the natural causal analogue of the Pearson correlation coefficient, and it permits comparing causation and correlation in a simple yet rigorous manner. Together, the two coefficients provide a natural way to classify the correlation/causation relationships that can occur in practice, and examples of each relationship are provided. In addition, the typical relationship between correlation and causation is analyzed to give insight into why the two are so often conflated. Finally, example calculations of the causation coefficient are shown on a real data set.
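The abstract does not reproduce the coefficient's formal definition, so the following is only an illustrative sketch of the idea of comparing a correlation with a causal analogue. It assumes one plausible construction: the causal analogue is taken to be the Pearson correlation computed under the joint P(x) P(y | do(x)), where the interventional distribution is obtained by adjusting for a single observed confounder Z. The toy model, its probabilities, and all function names are hypothetical and are not taken from the paper.

```python
from itertools import product
from math import sqrt

# Hypothetical toy model (not from the paper): binary Z confounds X and Y,
# and X has NO causal effect on Y.
P_Z = {0: 0.5, 1: 0.5}

def p_x_given_z(x, z):
    p1 = 0.9 if z == 1 else 0.1          # P(X=1 | Z=z)
    return p1 if x == 1 else 1 - p1

def p_y_given_xz(y, x, z):
    p1 = 0.8 if z == 1 else 0.2          # P(Y=1 | X=x, Z=z); x is ignored on purpose
    return p1 if y == 1 else 1 - p1

def pearson(joint):
    """Pearson correlation of (X, Y) under a joint given as {(x, y): prob}."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    vx = sum(p * x * x for (x, _), p in joint.items()) - ex ** 2
    vy = sum(p * y * y for (_, y), p in joint.items()) - ey ** 2
    return (exy - ex * ey) / sqrt(vx * vy)

def observational_joint():
    """P(x, y) = sum_z P(z) P(x | z) P(y | x, z)."""
    joint = {(x, y): 0.0 for x, y in product((0, 1), repeat=2)}
    for z, x, y in product((0, 1), repeat=3):
        joint[(x, y)] += P_Z[z] * p_x_given_z(x, z) * p_y_given_xz(y, x, z)
    return joint

def interventional_joint():
    """Assumed causal analogue: P(x) * P(y | do(x)), where
    P(y | do(x)) = sum_z P(z) P(y | x, z) (adjustment for Z)."""
    p_x = {x: sum(P_Z[z] * p_x_given_z(x, z) for z in (0, 1)) for x in (0, 1)}
    joint = {}
    for x, y in product((0, 1), repeat=2):
        p_y_do_x = sum(P_Z[z] * p_y_given_xz(y, x, z) for z in (0, 1))
        joint[(x, y)] = p_x[x] * p_y_do_x
    return joint

rho = pearson(observational_joint())      # ordinary correlation
gamma = pearson(interventional_joint())   # assumed causal analogue

print(f"correlation: {rho:.3f}, causal analogue: {gamma:.3f}")
```

In this toy model the correlation is clearly positive (about 0.48) while the assumed causal analogue is zero up to floating-point error, which illustrates one cell of the kind of taxonomy the abstract describes: correlation without causation.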