OC LG ML Dec 4, 2024

Optimal probabilistic feature shifts for reclassification in tree ensembles

Víctor Blanco, Alberto Japón, Justo Puerto, Peter Zhang

arXiv:2412.03722v1 3.2 h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses the need for interpretable and efficient reclassification strategies in machine learning, particularly for tree-based models, though it appears incremental as it builds on existing optimization and ensemble methods.

The authors tackled the problem of finding optimal feature perturbations for reclassifying observations in tree ensembles, developing a method that maximizes the probability of reaching a desired class by focusing on a few key features rather than minimal distance changes, and validated it on a real dataset.

In this paper we provide a novel methodology, based on mathematical optimization, for perturbing the features of a given observation so that a tree ensemble classification rule re-classifies it to a desired class. The method rests on three facts: the most viable changes for an observation to reach the desired class do not always coincide with the closest point (in the feature space) of the target class; individuals put effort into only a small number of features to reach the desired class; and each individual is endowed with a probability of changing each of its features to a given value, which determines the overall probability of changing to the target class. Putting these together, we provide different methods to find the features on which individuals must exert effort to maximize the probability of reaching the target class. Our method also allows us to rank the most important features in the tree ensemble. The proposed methodology is tested on a real dataset, validating the proposal.
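The idea in the abstract can be illustrated with a small brute-force sketch. This is not the authors' optimization formulation: it simply enumerates candidate shifts on a toy majority-vote ensemble. The trees, the feature names (`income`, `debt`, `age`), the candidate values, and their probabilities are all hypothetical, and the overall probability of a shift is assumed here to be the product of the per-feature probabilities (that is, independence is assumed).

```python
from itertools import combinations, product

# Hypothetical toy ensemble: three hand-written trees voting on classes 0/1.
def tree_a(x):
    return 1 if x["income"] > 50 else 0

def tree_b(x):
    return 1 if x["debt"] < 20 else 0

def tree_c(x):
    return 1 if x["income"] > 40 and x["age"] > 30 else 0

ENSEMBLE = [tree_a, tree_b, tree_c]

def predict(x):
    """Majority vote of the toy ensemble."""
    votes = sum(tree(x) for tree in ENSEMBLE)
    return 1 if 2 * votes > len(ENSEMBLE) else 0

# Assumed data: CHANGE_PROB[feature][new_value] is the probability that the
# individual manages to move that feature to new_value.
CHANGE_PROB = {
    "income": {45: 0.6, 55: 0.3},
    "debt": {15: 0.5},
    "age": {35: 0.05},
}

def best_shift(x, target, max_features):
    """Enumerate shifts touching at most max_features features and return
    (probability, {feature: new_value}) for the most probable shift that the
    ensemble classifies as target. Returns (0.0, {}) if none exists."""
    best_prob, best_changes = 0.0, {}
    features = list(CHANGE_PROB)
    for k in range(1, max_features + 1):
        for subset in combinations(features, k):
            options = [CHANGE_PROB[f].items() for f in subset]
            for choice in product(*options):
                shifted = dict(x)
                prob = 1.0  # product of per-feature probabilities (independence assumed)
                for feature, (value, p) in zip(subset, choice):
                    shifted[feature] = value
                    prob *= p
                if predict(shifted) == target and prob > best_prob:
                    best_prob = prob
                    best_changes = {f: v for f, (v, _) in zip(subset, choice)}
    return best_prob, best_changes

observation = {"income": 35, "debt": 30, "age": 25}
prob, changes = best_shift(observation, target=1, max_features=2)
print(changes, round(prob, 3))
```

In this toy setting no single-feature change flips the vote, so the search has to combine two features, and the most probable successful shift is not the one with the smallest change in feature values. Exhaustive enumeration only works for tiny examples; the paper relies on mathematical optimization instead.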
