LG · ML · Jan 24, 2022

Probability Distribution on Rooted Trees

arXiv:2201.09460v1
AI Analysis

This work addresses tree selection in data compression, image processing, and machine learning. It appears incremental, since it extends prior Bayesian approaches from full trees to more general rooted trees.

The paper tackles overfitting in tree selection for hierarchical models. It proposes a generalized probability distribution on rooted trees for which only the maximum number of child nodes and the maximum depth are fixed, and it derives recursive methods that evaluate the distribution's characteristics without approximation.

The hierarchical and recursive expressive capability of rooted trees makes them suitable for representing statistical models in various areas, such as data compression, image processing, and machine learning. On the other hand, this expressive capability makes tree selection difficult, because the selected tree must avoid overfitting. One unified approach to this problem is the Bayesian approach, in which the rooted tree is regarded as a random variable and a loss function can be assumed directly on the selected model or on the predicted value for a new data point. However, to the best of our knowledge, all previous studies on this approach are based on probability distributions on full trees. In this paper, we propose a generalized probability distribution on any rooted tree in which only the maximum number of child nodes and the maximum depth are fixed. Furthermore, we derive recursive methods to evaluate the characteristics of the probability distribution without any approximations.
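To make the idea of a distribution on rooted trees with bounded branching and depth concrete, here is a minimal sketch in Python. It is not the paper's definition. It assumes a simple hypothetical model in which every node above the maximum depth includes each of its k child slots independently with probability g. The names enumerate_trees, tree_probability, count_nodes, and expected_nodes, and the parameter g, are all illustrative assumptions. The sketch shows how one characteristic (the expected number of nodes) can be computed exactly by a recursion over depth and checked against brute-force enumeration.

```python
from itertools import product


def enumerate_trees(k, max_depth, depth=0):
    """Yield every rooted tree with at most k children per node and depth <= max_depth.

    A tree is a tuple of k child slots; None marks an absent child.
    Nodes at the maximum depth have no slots and are the empty tuple.
    """
    if depth == max_depth:
        yield ()
        return
    options = [None] + list(enumerate_trees(k, max_depth, depth + 1))
    for children in product(options, repeat=k):
        yield children


def tree_probability(tree, g):
    """Probability of a tree under the ASSUMED model: each slot is filled with probability g."""
    p = 1.0
    for child in tree:
        if child is None:
            p *= 1.0 - g
        else:
            p *= g * tree_probability(child, g)
    return p


def count_nodes(tree):
    """Number of nodes in the tree, including the root."""
    return 1 + sum(count_nodes(c) for c in tree if c is not None)


def expected_nodes(k, max_depth, g, depth=0):
    """Exact expected node count via the recursion E_d = 1 + k * g * E_{d+1}."""
    if depth == max_depth:
        return 1.0
    return 1.0 + k * g * expected_nodes(k, max_depth, g, depth + 1)


if __name__ == "__main__":
    k, max_depth, g = 2, 2, 0.3
    trees = list(enumerate_trees(k, max_depth))
    total = sum(tree_probability(t, g) for t in trees)
    brute = sum(tree_probability(t, g) * count_nodes(t) for t in trees)
    print("number of trees:", len(trees))
    print("total probability:", total)
    print("expected nodes (enumeration):", brute)
    print("expected nodes (recursion):", expected_nodes(k, max_depth, g))
```

The enumeration grows very quickly with k and the maximum depth, while the recursion needs only one step per depth level. This is the kind of saving that exact recursive evaluation offers over summing across all trees.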
