LG CL | Apr 16, 2021

Probing artificial neural networks: insights from neuroscience

Anna A. Ivanova, John Hewitt, Noga Zaslavsky

arXiv:2104.08197v1 | 14.121 citations

Originality: Synthesis-oriented

AI Analysis

This work aims to improve interpretability tools for researchers in machine learning and neuroscience. Its contribution is incremental: it builds on existing probing methods rather than introducing a new paradigm.

The paper draws on neuroscience to guide probing research in machine learning. It highlights two key probe design choices, direction and expressivity, and argues that probe design should follow from explicitly stated research goals.

A major challenge in both neuroscience and machine learning is the development of useful tools for understanding complex information processing systems. One such tool is probes, i.e., supervised models that relate features of interest to activation patterns arising in biological or artificial neural networks. Neuroscience has paved the way in using such models through numerous studies conducted in recent decades. In this work, we draw insights from neuroscience to help guide probing research in machine learning. We highlight two important design choices for probes, direction and expressivity, and relate these choices to research goals. We argue that specific research goals play a paramount role when designing a probe and encourage future probing studies to be explicit in stating these goals.
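To make the idea of a probe concrete, here is a minimal sketch in Python, not the authors' method. It assumes that "direction" refers to whether the probe maps activations to the feature (called decoding here) or the feature to activations (called encoding here); that reading is an interpretation of the abstract. The data are synthetic, and the helper names `fit_linear_probe` and `r_squared` are hypothetical.

```python
import numpy as np


def fit_linear_probe(inputs, targets):
    """Fit a least-squares linear map (with a bias term) from inputs to targets."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights


def r_squared(inputs, targets, weights):
    """Fraction of target variance explained, pooled over all target columns."""
    design = np.column_stack([inputs, np.ones(len(inputs))])
    residual = targets - design @ weights
    total = targets - targets.mean(axis=0)
    return 1.0 - (residual ** 2).sum() / (total ** 2).sum()


# Synthetic stand-in for network activations (an assumption for illustration):
# 20 units that linearly encode one scalar feature, plus independent noise.
rng = np.random.default_rng(0)
n_samples, n_units = 500, 20
feature = rng.normal(size=(n_samples, 1))
loading = rng.normal(size=(1, n_units))
activations = feature @ loading + 0.5 * rng.normal(size=(n_samples, n_units))

# Hold out the last 100 samples so scores reflect generalization, not fit.
train, test = slice(0, 400), slice(400, None)

# Decoding direction: predict the feature from the activation pattern.
decoder = fit_linear_probe(activations[train], feature[train])
decoding_r2 = r_squared(activations[test], feature[test], decoder)

# Encoding direction: predict the activation pattern from the feature.
encoder = fit_linear_probe(feature[train], activations[train])
encoding_r2 = r_squared(feature[test], activations[test], encoder)

print(f"decoding R^2: {decoding_r2:.3f}")
print(f"encoding R^2: {encoding_r2:.3f}")
```

Both probes here are linear, which is the low-expressivity end of the second design choice. Swapping `fit_linear_probe` for a nonlinear model would raise expressivity, and whether that is appropriate depends on the research goal the probe is meant to serve.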
