LG NC Sep 7, 2023

Brief technical note on linearizing recurrent neural networks (RNNs) before vs after the pointwise nonlinearity

arXiv:2309.04030v1, 5 citations, h-index: 29
Originality: Synthesis-oriented
AI Analysis

This is an incremental technical clarification for researchers studying RNN dynamics.

The paper compares linearizing an RNN before versus after the pointwise nonlinearity. It shows that the two linearizations yield distinct dynamics matrices and eigenvectors, and that some context-dependent effects are readily apparent in the activity-based linearization but not in the activation-based one.

Linearization of the dynamics of recurrent neural networks (RNNs) is often used to study their properties. The same RNN dynamics can be written in terms of the "activations" (the net inputs to each unit, before its pointwise nonlinearity) or in terms of the "activities" (the output of each unit, after its pointwise nonlinearity); the two corresponding linearizations are different from each other. This brief and informal technical note describes the relationship between the two linearizations and between the left and right eigenvectors of their dynamics matrices, and shows that some context-dependent effects are readily apparent under linearization of activity dynamics but not under linearization of activation dynamics.
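To make the distinction concrete, here is a minimal numerical sketch. It is not taken from the note itself. It assumes a standard rate network with activations x obeying dx/dt = -x + J tanh(x) + u and activities r = tanh(x). The names J, u, x_star, A_act and A_rate are illustrative choices. Under these assumptions, linearizing at a fixed point x_star gives the matrix -I + J D for activations and -I + D J for activities, where D is the diagonal matrix of gains tanh'(x_star). The two matrices differ, but each right eigenvector of the second is D times a right eigenvector of the first.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Hypothetical recurrent weights and operating point (illustrative values only).
J = rng.normal(scale=1.0 / np.sqrt(n), size=(n, n))
x_star = rng.normal(size=n)

# Choose the constant input u so that x_star is a fixed point of
# dx/dt = -x + J tanh(x) + u  (assumed model form, not from the source).
u = x_star - J @ np.tanh(x_star)

# Diagonal gain matrix: derivative of tanh evaluated at the fixed point.
D = np.diag(1.0 - np.tanh(x_star) ** 2)
I = np.eye(n)

# Linearization in activations (before the nonlinearity).
A_act = -I + J @ D
# Linearization in activities (after the nonlinearity).
A_rate = -I + D @ J

# Right eigenvectors of the activation-based matrix.
evals, V_act = np.linalg.eig(A_act)

# Mapping each eigenvector through D should give an eigenvector of the
# activity-based matrix with the same eigenvalue.
V_rate = D @ V_act
residual = np.linalg.norm(A_rate @ V_rate - V_rate * evals)

print("matrices identical:", np.allclose(A_act, A_rate))
print("eigenvector residual:", residual)
```

Because D depends on the fixed point, any input that shifts x_star also rescales the rows of D J and the columns of J D differently, which is one way to see why the two descriptions can expose different structure.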

