LG | ML | Jun 12, 2023

Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality Based on Decision Trees as Data Observation Processes

arXiv:2306.07060v1
h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses a computational bottleneck for researchers in statistical machine learning, offering an incremental improvement over prior methods that used trees to model data observation processes.

The paper addresses the infeasible computation of Bayes optimal predictions when decision trees are used to model the data observation process. The obstacle is a summation over all possible divisions of the feature space; the authors approximate it with an adaptive Markov chain Monte Carlo method, which they present as a solution to this open problem.

In the field of decision trees, most previous studies have difficulty ensuring the statistical optimality of predictions for new data and suffer from overfitting, because trees are usually used only to represent prediction functions constructed from given data. In contrast, some studies, including this paper, use trees to represent the stochastic data observation processes behind the given data. By assuming a prior distribution over the trees, they derive the statistically optimal prediction based on Bayesian decision theory, which is robust against overfitting. However, these studies still face a problem in computing this Bayes optimal prediction, because it involves an infeasible summation over all division patterns of the feature space, which are represented by the trees and some parameters. In particular, the summation over combinations of division axes, i.e., the assignment of features to the inner nodes of the tree, remains an open problem. We solve this with a Markov chain Monte Carlo method whose step size is adaptively tuned according to the posterior distribution over the trees.
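
The idea of replacing the summation over feature assignments with posterior samples can be illustrated with a toy sketch. It is not the paper's algorithm and rests on several assumptions that do not come from the abstract: a fixed complete binary tree, binary features and labels, Beta-Bernoulli leaves, and a uniform prior over feature assignments. The step size (the number of inner nodes resampled per proposal) is tuned from the acceptance rate during burn-in, a simple stand-in for the posterior-based tuning the abstract mentions. All function names are hypothetical.

```python
import math
import random
from itertools import product


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def leaf_index(features, x, depth):
    """Route x through a complete binary tree stored in heap order."""
    node = 0
    for _ in range(depth):
        node = 2 * node + 1 + x[features[node]]  # go left on 0, right on 1
    return node - (2 ** depth - 1)


def leaf_counts(features, data, depth):
    """Per-leaf label counts [n0, n1] for the given feature assignment."""
    counts = [[0, 0] for _ in range(2 ** depth)]
    for x, y in data:
        counts[leaf_index(features, x, depth)][y] += 1
    return counts


def log_marginal(features, data, depth, a=1.0, b=1.0):
    """Beta-Bernoulli log marginal likelihood of the labels (assumed leaf model)."""
    return sum(log_beta(a + n1, b + n0) - log_beta(a, b)
               for n0, n1 in leaf_counts(features, data, depth))


def predict_one(features, data, depth, x_new, a=1.0, b=1.0):
    """Posterior predictive P(y = 1 | x_new) for one fixed feature assignment."""
    n0, n1 = leaf_counts(features, data, depth)[leaf_index(features, x_new, depth)]
    return (a + n1) / (a + b + n0 + n1)


def mcmc_predict(data, x_new, n_features, depth=2, n_iter=2000, seed=0):
    """Average the predictive over Metropolis-Hastings samples of assignments."""
    rng = random.Random(seed)
    n_inner = 2 ** depth - 1
    current = [rng.randrange(n_features) for _ in range(n_inner)]
    current_lp = log_marginal(current, data, depth)
    burn_in, window = n_iter // 2, 100
    step, accepted, total = 1, 0, 0.0
    for t in range(1, n_iter + 1):
        # Symmetric proposal: resample the feature at `step` random inner nodes.
        proposal = list(current)
        for node in rng.sample(range(n_inner), step):
            proposal[node] = rng.randrange(n_features)
        proposal_lp = log_marginal(proposal, data, depth)
        # Uniform prior, so the acceptance ratio is the marginal-likelihood ratio.
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        if t <= burn_in:
            # Assumed tuning rule: adapt the step size only during burn-in.
            if t % window == 0:
                rate = accepted / window
                if rate > 0.5:
                    step = min(n_inner, step + 1)
                elif rate < 0.2:
                    step = max(1, step - 1)
                accepted = 0
        else:
            total += predict_one(current, data, depth, x_new)
    return total / (n_iter - burn_in)


# Toy data: three binary features, label equal to the first feature.
data = [(x, x[0]) for x in product((0, 1), repeat=3) for _ in range(5)]
print(mcmc_predict(data, (1, 0, 0), n_features=3))
```

Restricting adaptation to the burn-in phase keeps the post-burn-in chain an ordinary Metropolis-Hastings sampler, so the averaged predictive is a standard Monte Carlo estimate under the stated assumptions.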

