Convex Polytope Trees
This work addresses the trade-off between accuracy and interpretability in decision trees, a concern for machine learning practitioners. The contribution appears incremental, since it generalizes the decision boundary already used by existing decision trees.
Decision trees often need many nodes to reach high accuracy, which reduces their interpretability. The paper proposes convex polytope trees (CPT), which use convex polytopes as splitting functions, and shows empirically that CPT outperforms state-of-the-art decision trees on real-world classification and regression tasks.
A decision tree is commonly restricted to use a single hyperplane to split the covariate space at each of its internal nodes. It often requires a large number of nodes to achieve high accuracy, hurting its interpretability. In this paper, we propose convex polytope trees (CPT) to expand the family of decision trees by an interpretable generalization of their decision boundary. The splitting function at each node of CPT is based on the logical disjunction of a community of differently weighted probabilistic linear decision-makers, which also geometrically corresponds to a convex polytope in the covariate space. We use a nonparametric Bayesian prior at each node to infer the community's size, encouraging simpler decision boundaries by shrinking the number of polytope facets. We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given. We empirically demonstrate the efficiency of CPT over existing state-of-the-art decision trees in several real-world classification and regression tasks from diverse domains.
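To make the splitting function more concrete, here is a minimal sketch of one possible reading of "a logical disjunction of differently weighted probabilistic linear decision-makers". It is not the authors' implementation. It assumes each decision-maker is a logistic unit on a linear score, that the disjunction is a noisy-OR, and that the per-member weight `r` acts as an exponent; the names `split_probability`, `inside_polytope`, and `facets` are hypothetical. Under these assumptions, the region where no decision-maker fires is an intersection of half-spaces, which is a convex polytope.

```python
import math


def log_sigmoid(t):
    # Numerically stable log of the logistic function.
    return min(t, 0.0) - math.log1p(math.exp(-abs(t)))


def linear_score(x, w, b):
    # Signed score of point x against the hyperplane w . x + b = 0.
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def split_probability(x, facets):
    """Noisy-OR probability that at least one decision-maker fires.

    facets: list of (w, b, r) tuples. Each (w, b) is one linear
    decision-maker; r > 0 is an assumed per-member weight used as an
    exponent on that member's "does not fire" probability.
    """
    log_none_fire = 0.0
    for w, b, r in facets:
        z = linear_score(x, w, b)
        # log(1 - sigmoid(z)) equals log_sigmoid(-z)
        log_none_fire += r * log_sigmoid(-z)
    return 1.0 - math.exp(log_none_fire)


def inside_polytope(x, facets):
    # Hard version: x is inside when every half-space constraint holds.
    return all(linear_score(x, w, b) <= 0.0 for w, b, _ in facets)


# Illustrative polytope: the unit square [0, 1]^2 written as four
# half-spaces, scaled by a sharpness factor so the soft split is
# close to the hard one.
s = 20.0
square = [
    ((-s, 0.0), 0.0, 1.0),  # -x1 <= 0
    ((s, 0.0), -s, 1.0),    # x1 - 1 <= 0
    ((0.0, -s), 0.0, 1.0),  # -x2 <= 0
    ((0.0, s), -s, 1.0),    # x2 - 1 <= 0
]

p_inside = split_probability((0.5, 0.5), square)
p_outside = split_probability((2.0, 0.5), square)
```

In this sketch a point well inside the square gets a split probability near 0 and a point well outside gets one near 1, so a single node separates a closed convex region that a single-hyperplane split could not. Dropping a tuple from `facets` removes a facet, which is the kind of simplification the prior on community size is described as encouraging.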