CV, NC | Mar 10, 2024

Cracking the neural code for word recognition in convolutional neural networks

arXiv:2403.06159v2 | 3.79 citations | h-index: 9 | PLoS Comput. Biol.

Originality: Incremental advance

AI Analysis

This research addresses the problem of understanding invariant word recognition in neural systems, providing mechanistic insights for neuroscience and computational models of reading.

The study investigated how neural circuits achieve invariant word recognition by training deep neural networks to recognize written words and analyzing the emergence of reading-specialized units. It found that these units act as 'space bigrams', sensitive to specific letter identities and their distances from word boundaries, enabling invariant recognition and leading to predictions for reading behavior and neurophysiology.

Learning to read places a strong challenge on the visual system. Years of expertise lead to a remarkable capacity to separate highly similar letters and encode their relative positions, thus distinguishing words such as FORM and FROM, invariantly over a large range of sizes and absolute positions. How neural circuits achieve invariant word recognition remains unknown. Here, we address this issue by training deep neural network models to recognize written words and then analyzing how reading-specialized units emerge and operate across different layers of the network. With literacy, a small subset of units becomes specialized for word recognition in the learned script, similar to the "visual word form area" of the human brain. We show that these units are sensitive to specific letter identities and their distance from the blank space at the left or right of a word, thus acting as "space bigrams". These units specifically encode ordinal positions and operate by pooling across low- and high-frequency detector units from early layers of the network. The proposed neural code provides a mechanistic insight into how information on letter identity and position is extracted and allows for invariant word recognition, and leads to predictions for reading behavior, error patterns, and the neurophysiology of reading.
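
To make the "space bigram" idea concrete, here is a minimal symbolic sketch in Python. It is not the paper's network or analysis code: the function name `space_bigrams`, the `"L"`/`"R"` edge labels, and the set-of-triples representation are assumptions made purely for illustration. Each letter is tagged with its ordinal distance from the blank space on the left and on the right of the word, which is enough to tell FORM from FROM while ignoring where the word sits on the line.

```python
from typing import Set, Tuple

# Hypothetical representation: (letter, edge, ordinal distance from that edge).
SpaceBigram = Tuple[str, str, int]


def space_bigrams(text: str) -> Set[SpaceBigram]:
    """Encode one word as a set of toy 'space bigrams'.

    Leading and trailing blanks are ignored, so the code does not depend
    on the absolute position of the word within `text`.
    """
    word = text.strip()
    n = len(word)
    codes: Set[SpaceBigram] = set()
    for i, letter in enumerate(word):
        codes.add((letter, "L", i + 1))  # ordinal distance from the left blank
        codes.add((letter, "R", n - i))  # ordinal distance from the right blank
    return codes


if __name__ == "__main__":
    form = space_bigrams("FORM")
    from_ = space_bigrams("FROM")
    shifted = space_bigrams("     FORM  ")

    print(form != from_)    # same letters, different ordinal positions
    print(form == shifted)  # same word, different absolute position
```

Because the distances are ordinal (first letter, second letter, and so on) rather than measured in pixels, the same code would in principle also be unaffected by font size, although this toy version works on characters and does not model size at all.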
