Rank Projection Trees for Multilevel Neural Network Interpretation
The work addresses the need for multiscale interpretation in fields such as biology, where understanding interactions between groups of inputs (e.g., genes) is crucial. Its contribution is incremental in that it builds on prior scoring methods.
The paper tackles the problem of interpreting neural networks by identifying important subsets of inputs across multiple scales, rather than just individual inputs, using a flexible framework that works with existing scoring functions. It demonstrates the method's ability to identify biologically significant genes and gene sets in cancer and psychiatric genomics.
A variety of methods have been proposed for interpreting nodes in deep neural networks, which typically involve scoring nodes at lower layers with respect to their effects on the output of higher-layer nodes (where lower and higher layers are closer to the input and output layers, respectively). However, we may be interested in picking out a prioritized collection of subsets of the inputs across a range of scales according to their importance for an output node, and not simply a prioritized ranking across the inputs as singletons. Such a situation may arise in biological applications, for instance, where we are interested in epistatic effects between groups of genes in determining a trait of interest. Here, we outline a flexible framework which may be used to generate multiscale network interpretations, using any previously defined scoring function. We demonstrate the ability of our method to pick out biologically important genes and gene sets in the domains of cancer and psychiatric genomics.
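The distinction drawn above between ranking inputs as singletons and ranking subsets of inputs can be illustrated with a minimal sketch. This is not the rank projection tree construction, which the abstract does not specify. It assumes gradient-times-input as one example of a previously defined scoring function, a tiny one-hidden-layer network, and naive aggregation of absolute scores over predefined groups. All function names, the network, and the groups "A" and "B" are hypothetical.

```python
import numpy as np

def forward(x, W1, w2):
    """Hypothetical tiny network: tanh hidden layer, linear scalar output."""
    h = np.tanh(W1 @ x)
    return w2 @ h, h

def grad_times_input(x, W1, w2):
    """Score each input as (d output / d input) * input.

    Used here only as a stand-in for "any previously defined scoring function".
    """
    _, h = forward(x, W1, w2)
    # Chain rule: d out / d x = W1^T (w2 * (1 - tanh^2))
    grad = W1.T @ (w2 * (1.0 - h ** 2))
    return grad * x

def rank_singletons(scores):
    """Input indices ordered by decreasing absolute score."""
    return [int(i) for i in np.argsort(-np.abs(scores))]

def rank_groups(scores, groups):
    """Group names ordered by decreasing summed absolute score.

    Assumption: a group's importance is the sum of its members' absolute
    scores. This is a naive aggregate chosen for illustration only.
    """
    totals = {name: float(np.sum(np.abs(scores[idx])))
              for name, idx in groups.items()}
    return sorted(totals, key=totals.get, reverse=True)

# Fixed example scores for four inputs and two hypothetical input groups.
scores = np.array([0.5, -0.1, 0.4, 0.3])
groups = {"A": [0, 1], "B": [2, 3]}

print(rank_singletons(scores))      # [0, 2, 3, 1]
print(rank_groups(scores, groups))  # ['B', 'A']
```

In this toy case the top-ranked single input (index 0) belongs to group A, yet group B ranks first as a set. A singleton ranking alone can miss this kind of group-level effect.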