CL · AI · Jul 28, 2022

Unit Testing for Concepts in Neural Networks

arXiv:2208.10244v2308 citations · h-index: 38
Originality: Synthesis-oriented
AI Analysis

This work addresses the interpretability and theoretical consistency of neural networks, for researchers in AI and cognitive science. It is incremental: it builds on existing theories and methods.

The paper asks whether neural networks align with symbolic theories of concepts, specifically Fodor's criteria. It proposes unit tests for those criteria and applies them to modern architectures on a visual concept learning task. The models succeeded on tests of groundedness, modularity, and reusability, but questions about causality remained unresolved.

Many complex problems are naturally understood in terms of symbolic concepts. For example, our concept of "cat" is related to our concepts of "ears" and "whiskers" in a non-arbitrary way. Fodor (1998) proposes one theory of concepts, which emphasizes symbolic representations related via constituency structures. Whether neural networks are consistent with such a theory is open for debate. We propose unit tests for evaluating whether a system's behavior is consistent with several key aspects of Fodor's criteria. Using a simple visual concept learning task, we evaluate several modern neural architectures against this specification. We find that models succeed on tests of groundedness, modularity, and reusability of concepts, but that important questions about causality remain open. Resolving these will require new methods for analyzing models' internal states.
