Neuromodulation Gated Transformer
This work aims to improve the performance of transformer models on natural language processing tasks. The contribution appears incremental: it adds a neuromodulation gating mechanism to existing transformer architectures.
The authors address this problem by introducing the Neuromodulation Gated Transformer (NGT), a simple architecture that implements neuromodulation through a multiplicative effect. Compared against baselines, they report that NGT achieves the best average performance on the SuperGLUE benchmark validation sets.
We introduce a novel architecture, the Neuromodulation Gated Transformer (NGT), which is a simple implementation of neuromodulation in transformers via a multiplicative effect. We compare it to baselines and show that it results in the best average performance on the SuperGLUE benchmark validation sets.
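To make the phrase "multiplicative effect" concrete, the following is a minimal sketch of one way a multiplicative gate could act on transformer hidden states. It is an illustration under stated assumptions, not the NGT gating block itself: the function name `gate_hidden_states`, the single linear map, and the sigmoid activation are all hypothetical choices made here, since the text above does not specify those details.

```python
import math
import random


def sigmoid(x):
    """Logistic function, mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gate_hidden_states(hidden, weight, bias):
    """Scale each hidden activation by a gate value in (0, 1).

    hidden: list of token vectors, shape (seq_len, d_model)
    weight: square matrix, shape (d_model, d_model)  [assumed gate parameters]
    bias:   vector, shape (d_model,)                 [assumed gate parameters]
    """
    gated = []
    for token in hidden:
        # Linear map producing one pre-activation per hidden dimension.
        pre = [
            sum(w * x for w, x in zip(row, token)) + b
            for row, b in zip(weight, bias)
        ]
        # Squash to (0, 1) so the gate can only attenuate, never flip sign.
        gate = [sigmoid(p) for p in pre]
        # The multiplicative effect: elementwise product of gate and activation.
        gated.append([g * x for g, x in zip(gate, token)])
    return gated


# Toy usage with random values (hypothetical sizes).
random.seed(0)
d_model, seq_len = 4, 3
hidden = [[random.uniform(-1, 1) for _ in range(d_model)] for _ in range(seq_len)]
weight = [[random.uniform(-0.5, 0.5) for _ in range(d_model)] for _ in range(d_model)]
bias = [0.0] * d_model
gated = gate_hidden_states(hidden, weight, bias)
```

In this sketch the gate values are computed from the same hidden states they modulate, so each activation is suppressed or passed through depending on its context rather than being shifted by an additive term. With all-zero gate parameters every gate equals 0.5, which simply halves the input.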