Coloring black boxes: visualization of neural network decisions
This work addresses the interpretability problem faced by users of neural networks, though the contribution is incremental, since it builds on existing visualization techniques.
The paper tackles the black-box nature of neural networks by proposing a visualization method that projects the images of training vectors onto polygon vertices to illustrate the network function. This enables analysis of learning dynamics, comparison of different networks, and estimation of classification confidence, as demonstrated on the Wine and Satimage datasets.
Neural networks are commonly regarded as black boxes performing incomprehensible functions. For classification problems, networks provide maps from a high-dimensional feature space to a K-dimensional image space. Images of the training vectors are projected onto polygon vertices, providing a visualization of the network function. Such a visualization may show the dynamics of learning, allow for comparison of different networks, display training vectors around which potential problems may arise, show differences due to regularization and optimization procedures, help investigate the stability of network classification under perturbation of the original vectors, and place a new data sample in relation to the training data, allowing for estimation of confidence in the classification of that sample. Illustrative examples for the three-class Wine data and the five-class Satimage data are described. The visualization method proposed here is applicable to any black box system that provides continuous outputs.
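The projection described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes one plausible reading: each class is assigned a vertex of a regular K-sided polygon, and a K-dimensional output vector is placed at the average of those vertices weighted by its normalized outputs. The function names `polygon_vertices` and `project_outputs` are hypothetical and are not taken from the paper.

```python
import numpy as np


def polygon_vertices(k):
    """Return the k vertices of a regular polygon on the unit circle, shape (k, 2)."""
    angles = 2.0 * np.pi * np.arange(k) / k
    return np.column_stack([np.cos(angles), np.sin(angles)])


def project_outputs(outputs, eps=1e-12):
    """Map K-dimensional non-negative outputs to 2-D points inside a polygon.

    Assumption (not from the paper): each point is the average of the
    class vertices, weighted by the outputs normalized to sum to one.
    """
    outputs = np.asarray(outputs, dtype=float)
    k = outputs.shape[1]
    # Guard against division by zero for all-zero output vectors.
    sums = np.maximum(outputs.sum(axis=1, keepdims=True), eps)
    weights = outputs / sums
    return weights @ polygon_vertices(k)


# Example with three classes, as in the Wine data (values are made up).
example_outputs = [
    [0.95, 0.03, 0.02],  # confident: lands close to the first vertex
    [0.40, 0.35, 0.25],  # ambiguous: lands near the polygon centre
]
points = project_outputs(example_outputs)
```

Under this assumed mapping, a vector with a single dominant output falls near the vertex of its class, while a vector with comparable outputs falls near the centre. Plotting `points` colored by true class would then highlight training vectors for which the classification is uncertain, and projecting a perturbed copy of an input shows how far its image moves.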