TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization
The work addresses a practical challenge for ML practitioners: choosing interpretable models from large sets of candidates. Its contribution is incremental, however, because it builds on existing techniques for generating Rashomon sets.
The paper tackles the problem of selecting among thousands of equally accurate sparse decision tree models. It presents TimberTrek, an interactive visualization system that helps users explore and curate models based on their domain knowledge. The tool is open-source and runs in computational notebooks and web browsers.
Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees: a huge set of almost-optimal interpretable ML models. To help ML practitioners identify models with desirable properties from this Rashomon set, we develop TimberTrek, the first interactive visualization system that summarizes thousands of sparse decision trees at scale. Two usage scenarios highlight how TimberTrek can empower users to easily explore, compare, and curate models that align with their domain knowledge and values. Our open-source tool runs directly in users' computational notebooks and web browsers, lowering the barrier to creating more responsible ML models. A public demo of TimberTrek is available at https://poloclub.github.io/timbertrek.
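
To make "almost-optimal" concrete, the following is a minimal sketch of how a Rashomon set could be selected from a pool of candidate trees. It is not TimberTrek's code or the cited technique's implementation. It assumes a regularized objective (error plus a per-leaf penalty) and a multiplicative tolerance `epsilon`, neither of which is specified in the abstract. The names `CandidateTree`, `objective`, `rashomon_set`, `leaf_penalty`, and `epsilon` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateTree:
    """Hypothetical record for one candidate model (not TimberTrek's data format)."""
    name: str
    error: float      # misclassification rate on the training data
    num_leaves: int   # tree size, used here as the sparsity penalty term


def objective(tree: CandidateTree, leaf_penalty: float) -> float:
    # Assumed objective: error plus a per-leaf penalty that favors sparse trees.
    return tree.error + leaf_penalty * tree.num_leaves


def rashomon_set(trees, leaf_penalty=0.01, epsilon=0.05):
    """Keep every tree whose objective is at most (1 + epsilon) times the best objective."""
    if not trees:
        return []
    best = min(objective(t, leaf_penalty) for t in trees)
    threshold = (1 + epsilon) * best
    return [t for t in trees if objective(t, leaf_penalty) <= threshold]


# Illustrative, made-up candidates.
candidates = [
    CandidateTree("tree_a", error=0.20, num_leaves=4),
    CandidateTree("tree_b", error=0.20, num_leaves=5),
    CandidateTree("tree_c", error=0.30, num_leaves=3),
]
near_optimal = rashomon_set(candidates, leaf_penalty=0.01, epsilon=0.05)
print([t.name for t in near_optimal])
```

In this toy example, `tree_a` and `tree_b` fall within the tolerance and `tree_c` does not. With real data such a set can contain thousands of trees, which is the scale at which choosing among them by hand becomes impractical.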