LG ML Jun 12, 2023

Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality Based on Decision Trees as Data Observation Processes

Yuta Nakahara, Shota Saito, Naoki Ichijo, Koki Kazama, Toshiyasu Matsushima

arXiv:2306.07060v12.0 h-index: 14

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck for researchers in statistical machine learning, offering an incremental improvement over prior methods that used trees to model data observation processes.

The paper addresses the infeasible computation of Bayes optimal predictions under decision tree models. It introduces a Markov chain Monte Carlo method with an adaptively tuned step size to handle the summation over combinations of division axes, i.e., the assignment of features to inner nodes of the tree, which the authors identify as an open problem.

In the field of decision trees, most previous studies struggle to ensure the statistical optimality of predictions for new data and suffer from overfitting, because they use trees only to represent prediction functions constructed from given data. In contrast, some studies, including this paper, use trees to represent the stochastic data observation process behind the given data. By assuming a prior distribution over the trees, they derive the statistically optimal prediction based on Bayesian decision theory, which is robust against overfitting. However, computing this Bayes optimal prediction remains a problem because it involves an infeasible summation over all division patterns of the feature space, which are represented by the trees and some parameters. In particular, the summation over combinations of division axes, i.e., the assignment of features to the inner nodes of the tree, is an open problem. We solve it with a Markov chain Monte Carlo method whose step size is adaptively tuned according to the posterior distribution of the trees.
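
The idea of replacing the summation over feature assignments with posterior samples can be illustrated with a minimal sketch. This is not the paper's algorithm: it uses a plain, non-adaptive Metropolis-Hastings sampler with a fixed single-node proposal, a complete binary tree of fixed depth, binary features, binary labels, a uniform prior over assignments, and a Beta(a, b) prior on each leaf. All function names (`leaf_index`, `log_marginal`, `mcmc_predict`) and parameters are hypothetical and chosen only for illustration.

```python
import math
import random

def leaf_index(x, features, depth):
    """Route a binary feature vector x down a complete binary tree (heap layout)."""
    node = 0
    for _ in range(depth):
        # Go to the left child when the tested feature is 0, right when it is 1.
        node = 2 * node + 1 + x[features[node]]
    return node - (2 ** depth - 1)

def leaf_counts(data, features, depth):
    """Count labels 0 and 1 falling into each leaf."""
    counts = [[0, 0] for _ in range(2 ** depth)]
    for x, y in data:
        counts[leaf_index(x, features, depth)][y] += 1
    return counts

def log_marginal(data, features, depth, a=1.0, b=1.0):
    """Log marginal likelihood of the labels, with leaf parameters integrated out."""
    total = 0.0
    for n0, n1 in leaf_counts(data, features, depth):
        total += (math.lgamma(a + n1) + math.lgamma(b + n0)
                  - math.lgamma(a + b + n0 + n1)
                  - math.lgamma(a) - math.lgamma(b) + math.lgamma(a + b))
    return total

def predictive(x_new, data, features, depth, a=1.0, b=1.0):
    """Posterior predictive P(y = 1 | x_new) for one fixed feature assignment."""
    n0, n1 = leaf_counts(data, features, depth)[leaf_index(x_new, features, depth)]
    return (a + n1) / (a + b + n0 + n1)

def mcmc_predict(x_new, data, n_features, depth=2, n_iter=2000, burn_in=500, seed=0):
    """Average the predictive over feature assignments sampled by Metropolis-Hastings."""
    rng = random.Random(seed)
    n_inner = 2 ** depth - 1
    features = [rng.randrange(n_features) for _ in range(n_inner)]
    logp = log_marginal(data, features, depth)
    total, kept = 0.0, 0
    for t in range(n_iter):
        # Symmetric proposal: reassign the feature of one randomly chosen inner node.
        proposal = list(features)
        proposal[rng.randrange(n_inner)] = rng.randrange(n_features)
        logp_new = log_marginal(data, proposal, depth)
        # Uniform prior over assignments, so the ratio reduces to marginal likelihoods.
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            features, logp = proposal, logp_new
        if t >= burn_in:
            total += predictive(x_new, data, features, depth)
            kept += 1
    return total / kept
```

Calling `mcmc_predict((1, 0, 1), data, n_features=3)` with `data` given as a list of `(x, y)` pairs returns a Monte Carlo estimate of the model-averaged probability that the label is 1. The sketch allows the same feature to appear more than once on a path, and its fixed proposal is the part that the paper replaces with a step size tuned to the posterior over trees.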
