NE AI LG · Mar 27, 2023

Exposing the Functionalities of Neurons for Gated Recurrent Unit Based Sequence-to-Sequence Model

Yi-Ting Lee, Da-Yi Wu, Chih-Chun Yang, Shou-De Lin

arXiv:2303.15072v12.71 citations · h-index: 6

Originality: Incremental advance

AI Analysis

This paper gives researchers in natural language processing and neural network interpretability insight into the internal mechanisms of RNN models, but the advance is incremental because it builds on existing Seq2Seq frameworks.

The paper tackles the challenge of analyzing neuron-level behavior in GRU-based sequence-to-sequence models without attention, and identifies four neuron types (storing, counting, triggering, and outputting) that work together to achieve token positioning.

The goal of this paper is to report scientific discoveries about a Seq2Seq model. Analyzing the behavior of RNN-based models at the neuron level is considered more challenging than analyzing DNN or CNN models because of their recursive nature. This paper provides a neuron-level analysis to explain why a vanilla GRU-based Seq2Seq model without attention can achieve token positioning. We find four different types of neurons (storing, counting, triggering, and outputting) and further uncover the mechanism by which these neurons work together to produce the right token in the right position.
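To make the idea of tracing a single neuron over time concrete, here is a minimal sketch of one scalar GRU unit whose hidden activation is recorded at every step. It is not the authors' code. The names `gru_unit_step`, `trace_activations`, and `COUNTER_PARAMS` are hypothetical, and the weights are hand-picked for illustration rather than learned, whereas the paper's neurons were found in a trained model. The update rule follows the common convention h_t = z * h_{t-1} + (1 - z) * h_candidate; some references swap the roles of z and 1 - z.

```python
import math


def sigmoid(a):
    """Logistic function used by the GRU gates."""
    return 1.0 / (1.0 + math.exp(-a))


def gru_unit_step(x, h_prev, p):
    """One time step of a single-unit (scalar) GRU.

    `p` is a dict of scalar weights; its keys are illustrative names.
    """
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Blend the previous state with the candidate state.
    return z * h_prev + (1.0 - z) * h_cand


# Assumption: hand-picked weights that ignore the input and the previous
# state inside the gates, so the unit drifts steadily toward tanh(3.0).
# This only mimics a counter-like signal; it is not a weight setting
# reported in the paper.
COUNTER_PARAMS = {
    "wz": 0.0, "uz": 0.0, "bz": 2.0,
    "wr": 0.0, "ur": 0.0, "br": 0.0,
    "wh": 0.0, "uh": 0.0, "bh": 3.0,
}


def trace_activations(inputs, params, h0=0.0):
    """Run the unit over `inputs` and return its activation at each step."""
    h = h0
    trace = []
    for x in inputs:
        h = gru_unit_step(x, h, params)
        trace.append(h)
    return trace


trace = trace_activations([0.0] * 8, COUNTER_PARAMS)
for step, value in enumerate(trace, start=1):
    print(f"step {step}: activation {value:.4f}")
```

With these weights the activation rises at every step, so its value encodes how many steps have elapsed. That is one simple way a unit could carry position information. In a trained model, the same kind of per-step trace would be collected for every hidden unit and inspected for such patterns.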

