LG · May 9, 2022

Visualization of Decision Trees based on General Line Coordinates to Support Explainable Models

arXiv:2205.04035v1 · 4 citations · h-index: 6
Originality: Synthesis-oriented
AI Analysis

This is an incremental method for domain experts to evaluate and improve decision tree models, helping avoid overgeneralization and overfitting.

The paper tackles the problem of interpreting decision tree models by proposing SPC-DT, a visualization method using Shifted Paired Coordinates to show attributes, cases, data flow, split tightness, and density, demonstrated on three real datasets.

Visualization of Machine Learning (ML) models is an important part of the ML process that enhances the interpretability and prediction accuracy of the models. This paper proposes a new method, SPC-DT, to visualize Decision Trees (DTs) as interpretable models. The method uses a version of General Line Coordinates called Shifted Paired Coordinates (SPC). In SPC, each n-D point is visualized in a set of shifted pairs of 2-D Cartesian coordinates as a directed graph. The new method expands and complements the capabilities of existing methods for visualizing DT models. It shows: (1) relations between attributes, (2) individual cases relative to the DT structure, (3) data flow in the DT, (4) how tight each split is to thresholds in the DT nodes, and (5) the density of cases in parts of the n-D space. This information is important for domain experts when evaluating and improving DT models, including their performance and the avoidance of overgeneralization and overfitting. The benefits of the method are demonstrated in case studies using three real datasets.
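To make the SPC idea in the abstract concrete, here is a minimal Python sketch of how an n-D point could be turned into a directed graph over shifted 2-D coordinate pairs. It is an illustration under stated assumptions, not the paper's implementation: the function names `spc_nodes` and `spc_edges`, the default side-by-side shifts, and the requirement that n be even are all hypothetical choices made for this example.

```python
from typing import List, Optional, Sequence, Tuple

Point2D = Tuple[float, float]


def spc_nodes(
    point: Sequence[float],
    shifts: Optional[Sequence[Point2D]] = None,
) -> List[Point2D]:
    """Map an n-D point to one 2-D node per consecutive attribute pair.

    Attributes (x1, x2) go to the first pair of axes, (x3, x4) to the
    second, and so on. Each pair of axes is displaced by its own shift.
    Assumption: n is even; handling odd n is left out of this sketch.
    """
    if len(point) % 2 != 0:
        raise ValueError("this sketch assumes an even number of attributes")
    n_pairs = len(point) // 2
    if shifts is None:
        # Hypothetical default layout: place the pairs side by side,
        # one unit apart along the horizontal axis.
        shifts = [(float(i), 0.0) for i in range(n_pairs)]
    if len(shifts) != n_pairs:
        raise ValueError("need exactly one shift per attribute pair")
    nodes = []
    for i in range(n_pairs):
        dx, dy = shifts[i]
        nodes.append((point[2 * i] + dx, point[2 * i + 1] + dy))
    return nodes


def spc_edges(nodes: Sequence[Point2D]) -> List[Tuple[Point2D, Point2D]]:
    """Connect consecutive nodes with directed edges (source, target)."""
    return list(zip(nodes, nodes[1:]))


# A 4-D case becomes two 2-D nodes joined by one directed edge.
case = [0.2, 0.8, 0.5, 0.1]
nodes = spc_nodes(case)
edges = spc_edges(nodes)
print(nodes)
print(edges)
```

With attributes normalized to [0, 1], the default shifts keep the coordinate pairs from overlapping; passing custom `shifts` lets the pairs be arranged differently, for example to align them with decision tree thresholds.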

