Narrowing the Gap: Random Forests In Theory and In Practice
This work addresses the gap between the theory and the practice of random forests, and is relevant to both researchers and practitioners. Its contribution is incremental, since it builds on existing theoretical models.
The authors tackle the gap between the theoretical understanding and the practical performance of random forests in two ways. They introduce a new theoretically tractable variant and prove that it is consistent, and they empirically compare it with other tractable models and with the algorithm used in practice to evaluate the impact of theoretical simplifications.
Despite widespread interest and practical use, the theoretical properties of random forests are still not well understood. In this paper we contribute to this understanding in two ways. We present a new theoretically tractable variant of random regression forests and prove that our algorithm is consistent. We also provide an empirical evaluation, comparing our algorithm and other theoretically tractable random forest models to the random forest algorithm used in practice. Our experiments provide insight into the relative importance of different simplifications that theoreticians have made to obtain tractable models for analysis.
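For readers unfamiliar with the term, the consistency claim above can be read roughly as follows. This is a sketch assuming the standard mean-squared notion of consistency for regression estimators. The symbols $X$, $Y$, $f$, and $\hat{f}_n$ are notation introduced here for illustration and do not appear in the abstract.

```latex
% Assumed notation (not from the abstract):
%   (X, Y)     a random input-response pair
%   f          the regression function, i.e. the target being estimated
%   \hat{f}_n  the forest estimator built from n training samples
f(x) = \mathbb{E}\left[\, Y \mid X = x \,\right]

% Consistency: the expected squared error against f vanishes as n grows.
% The expectation is over X, the training sample, and any internal
% randomness of the algorithm.
\lim_{n \to \infty} \mathbb{E}\left[ \left( \hat{f}_n(X) - f(X) \right)^2 \right] = 0
```

Under this reading, consistency says that the estimator converges to the regression function as the sample grows. It does not by itself say how fast the error shrinks at any finite sample size.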