LG · Dec 7, 2023

Invariant Random Forest: Tree-Based Model Solution for OOD Generalization

arXiv:2312.04273v3 · 3 citations · h-index: 6 · AAAI
Originality: Incremental advance
AI Analysis

The paper addresses OOD generalization for tree-based models, a domain-specific problem. The advance is incremental: it extends existing OOD methods from neural networks to decision trees.

The paper tackles out-of-distribution generalization for decision tree models by introducing the Invariant Decision Tree and its ensemble version, the Invariant Random Forest. Both penalize splits that behave unstably across environments, and they outperform non-OOD tree models on synthetic and real datasets.

Out-of-distribution (OOD) generalization is an essential topic in machine learning. However, recent research has focused only on the corresponding methods for neural networks. This paper introduces a novel and effective solution for OOD generalization of decision tree models, named Invariant Decision Tree (IDT). During the growth of the tree, IDT applies a penalty term to splits whose behavior is unstable or varies across environments. Its ensemble version, the Invariant Random Forest (IRF), is also constructed. Our proposed method is motivated by a theoretical result under mild conditions and validated by numerical tests on both synthetic and real datasets. Its superior performance compared to non-OOD tree models indicates that considering OOD generalization for tree models is necessary and deserves more attention.
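The idea of penalizing a split that behaves differently across environments can be illustrated with a small sketch. This is not the paper's actual criterion, whose exact form the abstract does not state. It assumes binary labels, Gini impurity, and a hypothetical penalty equal to the variance, across environments, of each child node's positive rate. The names `split_score`, `best_feature`, and `lam` are made up for illustration, as is the toy data.

```python
from statistics import pvariance


def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def split_score(x, y, env, threshold, lam):
    """Pooled child impurity plus lam times an instability penalty (lower is better)."""
    sides = {True: [], False: []}      # child labels, pooled over environments
    per_env = {True: {}, False: {}}    # child labels, grouped by environment
    for xi, yi, ei in zip(x, y, env):
        left = xi <= threshold
        sides[left].append(yi)
        per_env[left].setdefault(ei, []).append(yi)

    n = len(y)
    impurity = sum(len(s) / n * gini(s) for s in sides.values())

    # Hypothetical penalty (an assumption, not the paper's formula):
    # variance across environments of the positive rate in each child.
    penalty = 0.0
    for groups in per_env.values():
        rates = [sum(g) / len(g) for g in groups.values()]
        if len(rates) > 1:
            penalty += pvariance(rates)
    return impurity + lam * penalty


def best_feature(columns, y, env, lam, threshold=0.5):
    """Return the name of the feature whose split has the lowest score."""
    return min(columns, key=lambda name: split_score(columns[name], y, env, threshold, lam))


# Toy data: two environments, A and B, with identical label distributions.
y = [0, 0, 0, 0, 1, 1, 1, 1] * 2
env = ["A"] * 8 + ["B"] * 8
columns = {
    # Same 75%-accurate relationship to y in both environments.
    "invariant": [0, 0, 0, 1, 1, 1, 1, 0] * 2,
    # Perfect in environment A, much weaker in environment B.
    "spurious": [0, 0, 0, 0, 1, 1, 1, 1] + [0, 0, 0, 1, 1, 1, 0, 0],
}

print(best_feature(columns, y, env, lam=0.0))  # spurious
print(best_feature(columns, y, env, lam=5.0))  # invariant
```

With `lam=0.0` the score reduces to ordinary pooled Gini impurity, so the split on the spurious feature wins because it looks purer on the pooled data. With a large enough `lam`, the instability of that split across environments outweighs its impurity advantage and the invariant feature is chosen instead.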

