AI
Aug 28, 2023

Causality-Based Feature Importance Quantifying Methods: PN-FI, PS-FI and PNS-FI

arXiv:2308.14474v2
h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses feature selection for improving model training efficiency in ML, but it is incremental as it applies existing causality concepts to a new context.

The paper tackles the problem of quantifying feature importance (FI) for feature selection in machine learning. It introduces three causality-based methods (PN-FI, PS-FI, and PNS-FI), built on the probabilities of necessity and sufficiency, and reports experiments in which FI values come out as intervals with tight bounds and dog eyes rank as the most important of the three image features tested.

In the current ML field, models are getting larger and more complex, and the data used to train them are growing in both quantity and dimensionality. To train better models while saving training time and computational resources, a good Feature Selection (FS) method in the preprocessing stage is therefore necessary. Feature importance (FI) matters because it is the basis of feature selection. This paper introduces the calculation of PN (the probability of Necessity), PS (the probability of Sufficiency), and PNS (the probability of Necessity and Sufficiency) from Causality into quantifying feature importance and creates three new FI measuring methods: PN-FI, which measures how important a feature is in image recognition tasks; PS-FI, which measures how important a feature is in image generating tasks; and PNS-FI, which measures both. The main body of this paper is three RCTs, whose results we use to show how the PS-FI, PN-FI, and PNS-FI of three features (dog nose, dog eyes, and dog mouth) are calculated. The experiments show that, first, FI values are intervals with tight upper and lower bounds. Second, the feature dog eyes is the most important, while the other two are of almost equal importance. Third, the bounds of PNS and PN are tighter than the bounds of PS.
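To make the idea of FI values as intervals concrete, here is a minimal sketch of how lower and upper bounds on PNS, PN, and PS can be computed. It assumes the standard Tian-Pearl bounds for a binary feature x and a binary outcome y; the summary above does not state the paper's exact formulas, so this may differ from what the authors use. The function names, argument names, and all numbers are hypothetical. Note that the PN and PS bounds below also need observational joint probabilities, whereas the PNS bounds need only experimental (RCT) quantities.

```python
# Sketch under stated assumptions: Tian-Pearl bounds for binary x and y.
# x  = feature present, nx = feature absent
# y  = outcome occurs (e.g. image recognised as a dog), ny = it does not
# p_y_do_x  = P(y | do(x)),  p_y_do_nx = P(y | do(nx))   (experimental)
# p_xy, p_nx_y, p_nx_ny = observational joint probabilities


def pns_bounds(p_y_do_x, p_y_do_nx):
    """Bounds on PNS from experimental data alone."""
    lower = max(0.0, p_y_do_x - p_y_do_nx)
    upper = min(p_y_do_x, 1.0 - p_y_do_nx)
    return lower, upper


def pn_bounds(p_y_do_nx, p_xy, p_nx_y, p_nx_ny):
    """Bounds on PN from experimental plus observational data."""
    if p_xy <= 0:
        raise ValueError("P(x, y) must be positive")
    p_y = p_xy + p_nx_y  # marginal P(y)
    lower = max(0.0, (p_y - p_y_do_nx) / p_xy)
    upper = min(1.0, ((1.0 - p_y_do_nx) - p_nx_ny) / p_xy)
    return lower, upper


def ps_bounds(p_y_do_x, p_xy, p_nx_y, p_nx_ny):
    """Bounds on PS from experimental plus observational data."""
    if p_nx_ny <= 0:
        raise ValueError("P(nx, ny) must be positive")
    p_y = p_xy + p_nx_y  # marginal P(y)
    lower = max(0.0, (p_y_do_x - p_y) / p_nx_ny)
    upper = min(1.0, (p_y_do_x - p_xy) / p_nx_ny)
    return lower, upper


# Hypothetical numbers, not taken from the paper.
exp = {"p_y_do_x": 0.9, "p_y_do_nx": 0.2}
obs = {"p_xy": 0.45, "p_nx_y": 0.10, "p_nx_ny": 0.40}

print("PNS bounds:", pns_bounds(exp["p_y_do_x"], exp["p_y_do_nx"]))
print("PN  bounds:", pn_bounds(exp["p_y_do_nx"], **obs))
print("PS  bounds:", ps_bounds(exp["p_y_do_x"], **obs))
```

Each function returns a (lower, upper) pair, so a feature's importance under this sketch is an interval rather than a single number, and the width of the interval indicates how tightly the data pin the quantity down.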

