Making Tree Ensembles Interpretable
This work addresses the interpretability problem for users of tree ensembles, but the contribution is incremental: it builds on existing methods without introducing a new paradigm.
The paper tackles the limited interpretability of tree ensembles such as random forests. It proposes a post-processing method that approximates a complex ensemble with a simpler, interpretable model, fitted by an EM algorithm that minimizes the KL divergence from the ensemble. The approximation is reported to be reasonable in a synthetic experiment.
Tree ensembles, such as random forests and boosted trees, are renowned for their high prediction performance, but their interpretability is critically limited. In this paper, we propose a post-processing method that improves the interpretability of tree ensembles. After learning a complex tree ensemble in the standard way, we approximate it with a simpler model that is interpretable to humans. To obtain the simpler model, we derive an EM algorithm that minimizes the KL divergence from the complex ensemble. A synthetic experiment showed that a complicated tree ensemble could be reasonably approximated by an interpretable model.
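The core idea of the abstract, replacing a complex ensemble with a simpler model chosen to minimize KL divergence from the ensemble's predictions, can be illustrated with a toy sketch. This is not the paper's EM algorithm: it assumes a one-dimensional input, a hypothetical ensemble of 21 decision stumps, and a simple model with a single split, and it swaps EM for a plain grid search over candidate thresholds. All function and variable names are illustrative.

```python
import math

def kl_bernoulli(p, q, eps=1e-9):
    """KL(p || q) between two Bernoulli distributions given P(y=1)."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

# Hypothetical "complex" ensemble: 21 stumps with thresholds near 0.40 .. 0.60.
STUMP_THRESHOLDS = [0.40 + 0.01 * i for i in range(21)]

def ensemble_prob(x):
    """Fraction of stumps voting for class 1 (a stump votes 1 when x > threshold)."""
    votes = sum(1 for t in STUMP_THRESHOLDS if x > t)
    return votes / len(STUMP_THRESHOLDS)

def fit_single_split(xs, threshold):
    """Simple model: one split with a constant probability per region.

    For a fixed split, the constant q that minimizes the summed KL(p_i || q)
    is the mean ensemble probability within that region.
    """
    left = [ensemble_prob(x) for x in xs if x <= threshold]
    right = [ensemble_prob(x) for x in xs if x > threshold]
    q_left = sum(left) / len(left) if left else 0.5
    q_right = sum(right) / len(right) if right else 0.5
    return q_left, q_right

def mean_kl(xs, threshold, q_left, q_right):
    """Average KL divergence from the ensemble to the simple model over xs."""
    total = 0.0
    for x in xs:
        q = q_left if x <= threshold else q_right
        total += kl_bernoulli(ensemble_prob(x), q)
    return total / len(xs)

def approximate(xs, candidates):
    """Grid search (a stand-in for EM): keep the split with the lowest mean KL."""
    best = None
    for t in candidates:
        q_left, q_right = fit_single_split(xs, t)
        score = mean_kl(xs, t, q_left, q_right)
        if best is None or score < best[0]:
            best = (score, t, q_left, q_right)
    return best

xs = [i / 100 for i in range(101)]           # evaluation points in [0, 1]
candidates = [i / 20 for i in range(1, 20)]  # candidate split thresholds
score, t, q_left, q_right = approximate(xs, candidates)
print(f"split at x <= {t}: P(y=1) = {q_left:.3f} (left), {q_right:.3f} (right), mean KL = {score:.4f}")
```

The resulting model is a single rule with two probabilities, which is easy to read, while the mean KL value quantifies how much of the ensemble's behavior the simplification loses. A real setting would involve multi-dimensional inputs and several regions, which is where an iterative procedure such as the EM algorithm described in the abstract becomes relevant.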