Explicating feature contribution using Random Forest proximity distances
This paper provides a method for explicating feature contributions in Random Forests. The contribution is incremental, as it builds on the existing concept of proximity distances.
The paper tackles the problem of interpreting black-box decisions in Random Forests by using proximity distances to determine each feature's independent contribution to classification, enabling explication, auditing, and error analysis.
In Random Forests, proximity distances are a metric representation of the data in decision space. By observing how changes in input map to the movement of instances in this space, we are able to determine the independent contribution of each feature to the decision-making process. For binary feature vectors, this process is fully specified: each feature can only be flipped, and the flip moves a particular instance nearer to the in-group or the out-group. Using this technique, we are able to calculate the contribution of each feature in determining how black-box decisions were made. This allows explication of the decision-making process, audit of the classifier, and post-hoc analysis of errors in classification.
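The idea can be illustrated with a minimal sketch for binary feature vectors. This is not the paper's implementation: the toy forest of three hand-written trees, the names `proximity`, `margin`, and `feature_contributions`, and the choice of distance as one minus proximity are all assumptions made for illustration. Proximity is taken here as the fraction of trees in which two instances land in the same leaf. Each feature is flipped in turn, and its contribution is scored by how far the flip moves the instance away from the in-group and toward the out-group.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for a trained forest: each "tree" maps a binary
# feature vector to a leaf id. A real forest would be learned from data.
Tree = Callable[[Sequence[int]], int]

toy_forest: List[Tree] = [
    lambda x: x[0],                            # splits on feature 0 only
    lambda x: x[1],                            # splits on feature 1 only
    lambda x: 0 if x[0] == 0 else 1 + x[1],    # feature 0, then feature 1
]


def proximity(forest: List[Tree], a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of trees in which a and b fall into the same leaf."""
    shared = sum(tree(a) == tree(b) for tree in forest)
    return shared / len(forest)


def mean_distance(forest, x, group) -> float:
    """Mean proximity distance (assumed here to be 1 - proximity) to a group."""
    return sum(1.0 - proximity(forest, x, g) for g in group) / len(group)


def margin(forest, x, in_group, out_group) -> float:
    """Positive when x sits closer to the in-group than to the out-group."""
    return mean_distance(forest, x, out_group) - mean_distance(forest, x, in_group)


def feature_contributions(forest, x, in_group, out_group) -> List[float]:
    """Flip each binary feature and record how much the margin drops."""
    base = margin(forest, x, in_group, out_group)
    scores = []
    for j in range(len(x)):
        flipped = list(x)
        flipped[j] = 1 - flipped[j]            # binary flip of feature j
        scores.append(base - margin(forest, flipped, in_group, out_group))
    return scores


# Illustrative data only: feature 2 is never used by any tree.
in_group = [[1, 1, 0], [1, 1, 1]]
out_group = [[0, 0, 0], [0, 0, 1]]
instance = [1, 1, 0]

contributions = feature_contributions(toy_forest, instance, in_group, out_group)
for j, score in enumerate(contributions):
    print(f"feature {j}: contribution {score:.3f}")
```

In this toy setup, feature 0 receives the largest score because two of the three trees split on it, while feature 2 scores zero because no tree uses it, so flipping it does not move the instance in decision space. With a trained scikit-learn forest, the per-tree leaf ids could instead be obtained from `RandomForestClassifier.apply`.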