Random Tessellation Forests
This work addresses the need for more flexible partitioning methods in machine learning for multi-dimensional data, offering a novel framework that can capture complex inter-dimensional dependencies, though it builds incrementally on prior generalizations.
The authors tackled the problem of limited flexibility in space partitioning methods due to axis-aligned cuts by proposing the Random Tessellation Process (RTP), a framework that generalizes existing processes to allow non-axis aligned cuts in multi-dimensional data, resulting in improved accuracies in simulations and gene expression analysis.
Space partitioning methods such as random forests and the Mondrian process are powerful machine learning methods for multi-dimensional and relational data, and are based on recursively cutting a domain. The flexibility of these methods is often limited by the requirement that the cuts be axis aligned. The Ostomachion process and the self-consistent binary space partitioning-tree process were recently introduced as generalizations of the Mondrian process for space partitioning with non-axis aligned cuts in the two dimensional plane. Motivated by the need for a multi-dimensional partitioning tree with non-axis aligned cuts, we propose the Random Tessellation Process (RTP), a framework that includes the Mondrian process and the binary space partitioning-tree process as special cases. We derive a sequential Monte Carlo algorithm for inference, and provide random forest methods. Our process is self-consistent and can relax axis-aligned constraints, allowing complex inter-dimensional dependence to be captured. We present a simulation study, and analyse gene expression data of brain tissue, showing improved accuracies over other methods.