SORTeD Rashomon Sets of Sparse Decision Trees: Anytime Enumeration
This work aims to improve interpretability and flexibility for machine learning practitioners in high-stakes applications by enabling efficient exploration of multiple accurate trees, without hard-coding criteria such as fairness into the objective.
The paper tackles the enumeration of Rashomon sets of sparse decision trees, that is, sets of trees with similar performance but varying structures. It introduces SORTD, a framework that improves scalability and offers anytime behavior, reducing runtime by up to two orders of magnitude compared with state-of-the-art methods.
Sparse decision tree learning provides accurate and interpretable predictive models that are ideal for high-stakes applications by finding the single most accurate tree within a (soft) size limit. Rather than relying on a single "best" tree, Rashomon sets (sets of trees with similar performance but varying structures) can be used to enhance variable importance analysis, enrich explanations, and enable users to choose simpler trees or those that satisfy stakeholder preferences (e.g., fairness) without hard-coding such criteria into the objective function. However, because finding the optimal tree is NP-hard, enumerating the Rashomon set is inherently challenging. Therefore, we introduce SORTD, a novel framework that improves scalability and enumerates trees in the Rashomon set in order of the objective value, thus offering anytime behavior. Our experiments show that SORTD reduces runtime by up to two orders of magnitude compared with the state of the art. Moreover, SORTD can compute Rashomon sets for any separable and totally ordered objective and supports post-evaluating the set using other separable (and partially ordered) objectives. Together, these advances make exploring Rashomon sets more practical in real-world applications.
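To make the idea of ordered, anytime enumeration concrete, the following minimal Python sketch shows one standard way to list combinations of two subtrees in nondecreasing order of an additive (separable) objective. It is an illustration only and not the SORTD algorithm itself. The function names `sorted_pair_sums` and `rashomon_prefix`, the additive cost model, and the multiplicative bound (1 + epsilon) times the best objective are all assumptions introduced here.

```python
import heapq
from itertools import islice
from typing import Iterator, Sequence, Tuple

# Hypothetical illustration: none of these names come from the paper.


def sorted_pair_sums(
    left: Sequence[float], right: Sequence[float]
) -> Iterator[Tuple[float, int, int]]:
    """Lazily yield (left[i] + right[j], i, j) in nondecreasing order of the sum.

    Both inputs must already be sorted in nondecreasing order. They stand in
    for the objective values of candidate left and right subtrees, assuming
    the objective of the combined tree is simply their sum.
    """
    if not left or not right:
        return
    heap = [(left[0] + right[0], 0, 0)]
    seen = {(0, 0)}
    while heap:
        total, i, j = heapq.heappop(heap)
        yield total, i, j
        # Only the two index-wise successors can be the next-best candidates.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(left) and nj < len(right) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (left[ni] + right[nj], ni, nj))


def rashomon_prefix(
    left: Sequence[float], right: Sequence[float], epsilon: float
) -> Iterator[Tuple[float, int, int]]:
    """Yield combinations whose objective is within (1 + epsilon) of the best.

    The multiplicative bound is an assumed definition of "similar performance".
    """
    bound = None
    for total, i, j in sorted_pair_sums(left, right):
        if bound is None:
            bound = (1 + epsilon) * total  # the first item is the optimum
        if total > bound:
            break  # everything after this point is worse, so stop early
        yield total, i, j


if __name__ == "__main__":
    left_costs = [1, 3, 4]
    right_costs = [2, 5]
    # Anytime use: take only the two best combinations and stop.
    print(list(islice(sorted_pair_sums(left_costs, right_costs), 2)))
    # All combinations within a factor of 2 of the optimum.
    print(list(rashomon_prefix(left_costs, right_costs, epsilon=1.0)))
```

Because the generator yields solutions best-first, a caller can stop at any time and still hold the best solutions found so far, which is the sense of "anytime" illustrated here. The early `break` in `rashomon_prefix` is valid only because the enumeration is sorted.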