Weakly-supervised causal discovery based on fuzzy knowledge and complex data complementarity
This work addresses causal discovery in high-dimensional, small-sample settings, such as those found in biology, by reducing reliance on extensive domain expertise, though its approach appears incremental.
The paper proposes KEEL, a weakly-supervised method for causal discovery from observational data that integrates fuzzy knowledge and data complementarity. In experiments, including one on real protein signal transduction processes, KEEL outperforms state-of-the-art methods in accuracy, robustness, and computational efficiency.
Causal discovery from observational data is important for deciphering the causal mechanisms behind complex systems. However, the effectiveness of existing causal discovery methods is limited by low-quality prior knowledge, domain inconsistencies, and the challenges of high-dimensional datasets with small sample sizes. To address this gap, we propose KEEL, a novel weakly-supervised causal discovery method driven jointly by fuzzy knowledge and data. KEEL adopts a fuzzy causal knowledge schema that encapsulates diverse types of fuzzy knowledge and forms corresponding weakened constraints. This schema not only lessens the dependency on expertise but also allows various types of limited and error-prone fuzzy knowledge to guide causal discovery. It can enhance the generalization and robustness of causal discovery, especially in high-dimensional and small-sample scenarios. In addition, we integrate the extended linear causal model (ELCM) into KEEL to handle multi-distribution and incomplete data. Extensive experiments on different datasets demonstrate the superiority of KEEL over several state-of-the-art methods in accuracy, robustness, and computational efficiency. For causal discovery in real protein signal transduction processes, KEEL outperforms the benchmark method when data are limited. In summary, KEEL tackles causal discovery tasks with higher accuracy while alleviating the requirement for extensive domain expertise.
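The abstract does not spell out how fuzzy knowledge becomes a "weakened constraint", so the following is only a minimal illustrative sketch of the general idea, not KEEL's actual formulation. It assumes a plain linear model scored by least squares, and a hypothetical `belief` matrix whose entry `belief[i][j]` in [0, 1] expresses how strongly an edge from variable i to variable j is believed to exist (0.5 means no opinion). All function names, the threshold, and the weight `lam` are assumptions made for illustration.

```python
import random

def ls_loss(X, W):
    """Least-squares loss of the linear model x_j ~ sum_i x_i * W[i][j]."""
    n, d = len(X), len(W)
    total = 0.0
    for row in X:
        for j in range(d):
            pred = sum(row[i] * W[i][j] for i in range(d))
            total += (row[j] - pred) ** 2
    return 0.5 * total / n

def soft_prior_penalty(W, belief, threshold=0.3):
    """Soft (weakened) constraint: disagreement with a belief is penalized
    in proportion to how far that belief is from 0.5, never forbidden."""
    d = len(W)
    penalty = 0.0
    for i in range(d):
        for j in range(d):
            present = 1.0 if abs(W[i][j]) > threshold else 0.0
            expected = 1.0 if belief[i][j] > 0.5 else 0.0
            strength = 2.0 * abs(belief[i][j] - 0.5)  # 0 = no opinion
            penalty += strength * abs(present - expected)
    return penalty

def score(X, W, belief, lam=0.1):
    """Data fit plus the knowledge penalty; lower is better."""
    return ls_loss(X, W) + lam * soft_prior_penalty(W, belief)

# Toy data (hypothetical): x0 causes x1, with unit-variance noise.
rng = random.Random(0)
X = []
for _ in range(200):
    x0 = rng.gauss(0.0, 1.0)
    X.append([x0, x0 + rng.gauss(0.0, 1.0)])

# Fuzzy knowledge: "x0 -> x1 is likely" (0.8), "x1 -> x0 is unlikely" (0.2).
belief = [[0.5, 0.8],
          [0.2, 0.5]]

W_true = [[0.0, 1.0], [0.0, 0.0]]  # candidate graph with edge x0 -> x1
W_rev = [[0.0, 0.0], [0.5, 0.0]]   # candidate graph with edge x1 -> x0

better = "x0 -> x1" if score(X, W_true, belief) < score(X, W_rev, belief) else "x1 -> x0"
print("preferred direction:", better)
```

Because the beliefs only add a bounded penalty to the score, an incorrect piece of knowledge can be overruled when the data fit disagrees strongly enough, which is the sense in which such a constraint is weaker than a hard edge requirement.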