ML | LG | Feb 7, 2024

Riemann-Lebesgue Forest for Regression

arXiv:2402.04550v3 | h-index: 1 | Trans. Mach. Learn. Res.
AI Analysis

This is an incremental improvement for regression tasks in machine learning, potentially benefiting practitioners needing more accurate ensemble models.

The authors tackled the problem of improving regression ensemble methods by proposing Riemann-Lebesgue Forest (RLF), which uses a novel tree learner with Lebesgue-type cutting to achieve larger variance reduction than standard CART, and demonstrated competitive performance against random forest in simulations and real-world datasets.

We propose a novel ensemble method called Riemann-Lebesgue Forest (RLF) for regression. The core idea in RLF is to mimic the way a measurable function can be approximated by partitioning its range into a few intervals. With this idea in mind, we develop a new tree learner named Riemann-Lebesgue Tree (RLT), which has a chance to perform Lebesgue-type cutting, i.e., splitting the node based on the response $Y$, at certain non-terminal nodes. We show that the optimal Lebesgue-type cutting results in a larger variance reduction in the response $Y$ than ordinary CART (Breiman et al., 1984) cutting (an analogue of the Riemann partition). This property is beneficial to the ensemble part of RLF. We also generalize the asymptotic normality of RLF under different parameter settings. Two one-dimensional examples are provided to illustrate the flexibility of RLF. The competitive performance of RLF against the original random forest (Breiman, 2001) is demonstrated by experiments on simulated data and real-world datasets.
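The variance-reduction claim can be illustrated with a minimal sketch. This is not the authors' implementation: the helper `best_split_reduction`, the synthetic data, and the noise level are all assumptions made for illustration. The sketch compares the best two-way split of a single node when samples are ordered by a feature $X$ (CART-style, Riemann analogue) against the best split when samples are ordered by the response $Y$ itself (Lebesgue-type). Because the best two-group partition of one-dimensional values is always contiguous in sorted order, the $Y$-ordered split can never reduce the sum of squared errors by less than the $X$-ordered one.

```python
import numpy as np


def best_split_reduction(order_key, y):
    """Largest drop in the sum of squared errors (SSE) of y over all
    two-way splits taken after sorting the samples by order_key.

    Hypothetical helper for illustration; not taken from the paper.
    """
    y_sorted = y[np.argsort(order_key, kind="stable")]
    n = y_sorted.size
    total_sse = np.sum((y_sorted - y_sorted.mean()) ** 2)

    # Prefix sums let us evaluate every split position in O(n).
    csum = np.cumsum(y_sorted)
    csum2 = np.cumsum(y_sorted ** 2)
    k = np.arange(1, n)  # number of samples sent to the left child

    left_sse = csum2[:-1] - csum[:-1] ** 2 / k
    right_sse = (csum2[-1] - csum2[:-1]) - (csum[-1] - csum[:-1]) ** 2 / (n - k)
    return total_sse - np.min(left_sse + right_sse)


# Synthetic one-dimensional regression data (assumed for this sketch).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 200)
y = np.sin(6.0 * x) + rng.normal(0.0, 0.3, 200)

cart_reduction = best_split_reduction(x, y)      # split chosen along X
lebesgue_reduction = best_split_reduction(y, y)  # split chosen along Y

print(f"CART-style SSE reduction:    {cart_reduction:.3f}")
print(f"Lebesgue-type SSE reduction: {lebesgue_reduction:.3f}")
```

The sketch only compares training-set variance reduction within one node. It does not address how a response-based split is applied at prediction time, when $Y$ is unknown, nor how RLT chooses between the two cutting types; the abstract does not specify either point.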
