LG · May 26, 2014

Visualizing Random Forest with Self-Organising Map

arXiv:1405.6684v1 · 17 citations
Originality: Incremental advance
AI Analysis

This work addresses the interpretability of Random Forest classifiers, but it is an incremental advance because it builds on existing visualization techniques.

The authors tackle the problem of interpreting Random Forest models by proposing a visualization method that uses Self-Organising Maps (SOM) to reveal intrinsic relationships in the data. In addition, a SOM trained with the RF proximity matrix classifies more accurately than a SOM trained with Euclidean distance.

Random Forest (RF) is a powerful ensemble method for classification and regression tasks. It consists of a set of decision trees. Although a single tree is easily interpretable by a human, an ensemble of trees is a black-box model. A popular technique for looking inside an RF model is to visualize the RF proximity matrix, computed on the data samples, with Multidimensional Scaling (MDS). Herein, we present a novel method based on Self-Organising Maps (SOM) for revealing the intrinsic relationships in the data that lie inside an RF used for classification tasks. We propose an algorithm for training the SOM with the proximity matrix obtained from the RF. We compare visualizations of the RF proximity matrix produced with MDS and with SOM. Moreover, the SOM trained with the RF proximity matrix achieves better classification accuracy than the SOM trained with Euclidean distance. The presented approach enables a better understanding of the RF and additionally improves the accuracy of the SOM.
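The RF proximity matrix mentioned in the abstract can be illustrated with a minimal sketch. It assumes the usual definition: the proximity of two samples is the fraction of trees in which they reach the same leaf. The function names and the toy leaf ids below are hypothetical and are not taken from the paper; with scikit-learn, leaf ids of this shape could come from `RandomForestClassifier.apply(X)`. Using `1 - proximity` as the dissimilarity is one common choice for MDS-style embeddings, not necessarily the one the authors use.

```python
from typing import List, Sequence


def proximity_matrix(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Return the RF proximity matrix from per-tree leaf ids.

    leaves[i][t] is the id of the leaf that sample i reaches in tree t.
    The proximity of samples i and j is the fraction of trees in which
    both samples end up in the same leaf.
    """
    n_samples = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(i, n_samples):
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def dissimilarity_matrix(prox: Sequence[Sequence[float]]) -> List[List[float]]:
    """Turn proximities into dissimilarities (assumption: d = 1 - proximity)."""
    return [[1.0 - p for p in row] for row in prox]


# Hypothetical toy input: 4 samples, 3 trees (leaf ids are made up).
toy_leaves = [
    [1, 4, 2],
    [1, 4, 3],
    [2, 5, 3],
    [2, 5, 3],
]
prox = proximity_matrix(toy_leaves)
dist = dissimilarity_matrix(prox)
```

A matrix such as `dist` is what a precomputed-distance embedding (MDS, or a SOM variant that works on dissimilarities) would consume in place of Euclidean distances between feature vectors.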

