Interpretability and causal discovery of the machine learning models to predict the production of CBM wells after hydraulic fracturing
This work addresses interpretability and generalization issues in a specific domain (CBM production prediction), representing an incremental improvement through hybrid methods.
The authors tackled the problem of low generalization and lack of interpretability in machine learning models for predicting coalbed methane (CBM) well production after hydraulic fracturing by proposing a novel methodology that combines causal discovery with SHAP analysis. Their approach improved forecasting accuracy by an average of 20% compared to traditional methods and produced results consistent with actual physical mechanisms.
Machine learning approaches are widely studied for predicting the production of CBM wells after hydraulic fracturing, but they are rarely used in practice because of their low generalization ability and lack of interpretability. This article proposes a novel methodology that discovers latent causality from observed data, offering an indirect way to interpret machine learning results. Based on the theory of causal discovery, a causal graph is derived with explicit input, output, treatment, and confounding variables. SHAP is then employed to analyze the influence of each factor on production capability, which indirectly interprets the machine learning models. The proposed method can capture the underlying nonlinear relationships between the factors and the output, which remedies a limitation of traditional machine learning routines that select factors by correlation analysis. Experiments on CBM data show that the relationships detected by the presented method between production and the geological/engineering factors are consistent with the actual physical mechanism. In addition, compared with traditional methods, the interpretable machine learning models forecast production capability more accurately, with an average accuracy improvement of 20%.
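To make the SHAP step concrete, the following is a minimal sketch of how Shapley-value attributions can be computed exactly for a small model. It is not the authors' implementation: the toy production function, the feature names (permeability, gas content, fracturing fluid volume), and all numeric values are hypothetical, chosen only for illustration. The SHAP library approximates these same quantities efficiently for real models; here they are enumerated by brute force, which is feasible only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, background):
    """Exact Shapley values of predict(x) relative to a background sample."""
    n = len(x)

    def value(subset):
        # Average prediction when the features in `subset` are fixed to x
        # and the remaining features are taken from each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(n)]
            total += predict(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_production(features):
    # Hypothetical stand-in for a trained model: a nonlinear interaction
    # between two geological factors plus a linear engineering factor.
    permeability, gas_content, fluid_volume = features
    return 2.0 * permeability * gas_content + 0.5 * fluid_volume


# Hypothetical background wells: [permeability, gas_content, fluid_volume]
background = [
    [1.0, 8.0, 400.0],
    [0.5, 12.0, 600.0],
    [2.0, 10.0, 500.0],
    [1.5, 6.0, 300.0],
]
well = [1.8, 11.0, 550.0]  # hypothetical well to explain

phi = shapley_values(toy_production, well, background)
baseline = sum(toy_production(row) for row in background) / len(background)

for name, contribution in zip(["permeability", "gas_content", "fluid_volume"], phi):
    print(f"{name}: {contribution:+.3f}")
print(f"baseline + sum(phi) = {baseline + sum(phi):.3f}")
print(f"model prediction    = {toy_production(well):.3f}")
```

The attributions sum to the difference between the prediction for the explained well and the average prediction over the background wells, which is the property that makes per-factor contributions comparable. Note that such attributions describe the model's behaviour, not causation; in the proposed methodology, the causal graph is what determines which factors are treated as inputs, treatments, or confounders before SHAP is applied.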