Analyzing the tree-layer structure of Deep Forests
This work provides theoretical insights into tree-network architectures for machine learning practitioners, but it is incremental as it builds on existing deep forest concepts.
The paper investigates the underlying mechanisms of deep forests, showing that simplified shallow forest networks can outperform standard tree-based methods, with theoretical bounds on excess risk indicating performance improvements for well-structured data.
Random forests on the one hand, and neural networks on the other hand, have met great success in the machine learning community for their predictive performance. Combinations of both have been proposed in the literature, notably leading to the so-called deep forests (DF) (Zhou & Feng, 2019). In this paper, our aim is not to benchmark DF performance but rather to investigate its underlying mechanisms. Additionally, the DF architecture can generally be reduced to simpler and more computationally efficient shallow forest networks. Despite some instability, the latter may outperform standard predictive tree-based methods. We exhibit a theoretical framework in which a shallow tree network is shown to enhance the performance of classical decision trees. In this setting, we provide tight theoretical lower and upper bounds on its excess risk. These theoretical results show the value of tree-network architectures for well-structured data, provided that the first layer, acting as a data encoder, is rich enough.
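To make the idea of a first layer "acting as a data encoder" concrete, here is a minimal sketch of a two-layer tree network in plain Python. It is an illustration under stated assumptions, not the construction analyzed in the paper: the function names (`fit_tree`, `fit_two_layer`, `predict_two_layer`), the XOR-style toy data, the choice to append the first-layer prediction to the raw features, and the depths are all hypothetical.

```python
import random

def _sse(ys):
    """Sum of squared errors around the mean of a non-empty list."""
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def fit_tree(X, y, depth):
    """Fit a small CART-style regression tree, returned as a nested dict."""
    mean = sum(y) / len(y)
    if depth == 0 or len(set(y)) == 1:
        return {"value": mean}
    best = None  # (sse, feature index, threshold)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Thresholds sit at observed values, so both children are non-empty.
        for t in values[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            sse = _sse(left) + _sse(right)
            if best is None or sse < best[0]:
                best = (sse, j, t)
    if best is None:  # every feature is constant: no split is possible
        return {"value": mean}
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": fit_tree([X[i] for i in li], [y[i] for i in li], depth - 1),
        "right": fit_tree([X[i] for i in ri], [y[i] for i in ri], depth - 1),
    }

def predict_one(tree, row):
    """Walk from the root to a leaf and return the leaf mean."""
    while "value" not in tree:
        go_left = row[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["value"]

def fit_two_layer(X, y, depth1, depth2):
    """First layer encodes the data; second layer is trained on the encoding."""
    first = fit_tree(X, y, depth1)
    # Assumption: the encoding is the raw features plus the first-layer output.
    encoded = [row + [predict_one(first, row)] for row in X]
    second = fit_tree(encoded, y, depth2)
    return first, second

def predict_two_layer(model, row):
    first, second = model
    return predict_one(second, row + [predict_one(first, row)])

# Toy XOR-style ("chessboard-like") data on the unit square, an assumed example.
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
y = [float((a > 0.5) != (b > 0.5)) for a, b in X]

def mse(preds):
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

single = fit_tree(X, y, 3)
model = fit_two_layer(X, y, depth1=3, depth2=1)
print("single tree train MSE:", mse([predict_one(single, r) for r in X]))
print("two-layer   train MSE:", mse([predict_two_layer(model, r) for r in X]))
```

Varying `depth1` in this sketch is one way to probe the abstract's condition that the first layer be "rich enough": the second layer can only exploit structure that the first-layer output already separates.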