NE, LG, NC
Apr 2, 2020

Under the Hood of Neural Networks: Characterizing Learned Representations by Functional Neuron Populations and Network Ablations

arXiv:2004.01254v2
22 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses interpretability for safety-critical applications like autonomous driving and medical diagnostics, but it is incremental as it builds on existing neuroscience-inspired methods.

The paper tackles the lack of transparency in neural networks by characterizing learned representations using activation patterns and network ablations, revealing that individual neuron importance cannot be determined by activation magnitude, selectivity, or performance impact alone.

The need for more transparency in the decision-making processes of artificial neural networks is steadily increasing, driven by their application in safety-critical and ethically challenging domains such as autonomous driving and medical diagnostics. We address today's lack of transparency in neural networks and shed light on the roles that single neurons and groups of neurons play within a network fulfilling a learned task. Inspired by research in the field of neuroscience, we characterize the learned representations by activation patterns and network ablations, revealing functional neuron populations that a) act jointly in response to specific stimuli or b) have a similar impact on the network's performance after being ablated. We find that neither the magnitude or selectivity of a neuron's activation nor its impact on network performance is a sufficient stand-alone indicator of its importance for the overall task. We argue that such indicators are essential for future advances in transfer learning and modern neuroscience.
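To make the three indicators named in the abstract concrete, here is a minimal sketch of how activation magnitude, selectivity, and ablation impact could be computed per hidden unit. It is not the authors' implementation. The toy one-hidden-layer network, its random weights, the synthetic data, and all function names are assumptions for illustration only. The selectivity measure used here is the class selectivity index, (mu_max - mu_rest) / (mu_max + mu_rest), which may differ from the measure used in the paper.

```python
import numpy as np

def forward(x, w1, b1, w2, b2, ablate=None):
    """Forward pass of a one-hidden-layer ReLU network (hypothetical toy model).

    ablate: optional index or list of indices of hidden units to silence.
    """
    h = np.maximum(0.0, x @ w1 + b1)   # hidden activations, shape (samples, units)
    if ablate is not None:
        h = h.copy()
        h[:, ablate] = 0.0             # ablation: clamp the unit's output to zero
    return h, h @ w2 + b2              # activations and class logits

def accuracy(logits, y):
    return float(np.mean(np.argmax(logits, axis=1) == y))

def ablation_impact(x, y, params):
    """Accuracy drop caused by silencing each hidden unit in turn."""
    base = accuracy(forward(x, *params)[1], y)
    n_hidden = params[0].shape[1]
    return np.array([
        base - accuracy(forward(x, *params, ablate=j)[1], y)
        for j in range(n_hidden)
    ])

def class_selectivity(h, y, n_classes):
    """Class selectivity index per unit, computed from class-mean activations."""
    means = np.stack([h[y == c].mean(axis=0) for c in range(n_classes)])
    mu_max = means.max(axis=0)                                # most-preferred class
    mu_rest = (means.sum(axis=0) - mu_max) / (n_classes - 1)  # mean over the others
    return (mu_max - mu_rest) / (mu_max + mu_rest + 1e-12)    # epsilon guards dead units

# Synthetic data and random weights, for illustration only (assumed, not from the paper).
rng = np.random.default_rng(0)
n, d, n_hidden, n_classes = 200, 8, 16, 4
x = rng.normal(size=(n, d))
y = np.arange(n) % n_classes           # guarantees every class has samples
params = (rng.normal(size=(d, n_hidden)), np.zeros(n_hidden),
          rng.normal(size=(n_hidden, n_classes)), np.zeros(n_classes))

h, _ = forward(x, *params)
magnitude = h.mean(axis=0)             # indicator 1: mean activation magnitude
selectivity = class_selectivity(h, y, n_classes)   # indicator 2: selectivity
impact = ablation_impact(x, y, params)             # indicator 3: ablation impact

# Rank units by each indicator; the rankings need not agree with one another.
print("by magnitude:  ", np.argsort(-magnitude)[:5])
print("by selectivity:", np.argsort(-selectivity)[:5])
print("by impact:     ", np.argsort(-impact)[:5])
```

Comparing the three rankings on a trained network is one way to see the abstract's point: a unit that ranks high on one indicator may rank low on another. With the random weights used here, the numbers carry no meaning beyond showing the mechanics.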
