Recurrent Independent Mechanisms
This paper addresses the challenge of learning modular structures that reflect environmental dynamics, with the goal of more robust AI systems. The contribution appears incremental, since it builds on existing recurrent and attention mechanisms.
The paper aims to improve generalization and robustness in recurrent architectures. It introduces Recurrent Independent Mechanisms (RIMs), which combine nearly independent transition dynamics with sparse, attention-based communication. The authors report dramatically improved generalization on tasks with systematic variations between training and evaluation.
Learning modular structures which reflect the dynamics of the environment can lead to better generalization and robustness to changes which only affect a few of the underlying causes. We propose Recurrent Independent Mechanisms (RIMs), a new recurrent architecture in which multiple groups of recurrent cells operate with nearly independent transition dynamics, communicate only sparingly through the bottleneck of attention, and are only updated at time steps where they are most relevant. We show that this leads to specialization amongst the RIMs, which in turn allows for dramatically improved generalization on tasks where some factors of variation differ systematically between training and evaluation.
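The three ingredients named in the abstract (nearly independent dynamics, sparse attention-based communication, and updates only for the most relevant modules) can be illustrated with a minimal NumPy sketch of a single time step. This is not the authors' implementation. The class name `TinyRIMs`, the plain tanh recurrence, the all-zero "null" input slot, the parameter shapes, and the random initialization are all assumptions chosen for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class TinyRIMs:
    """Hypothetical, simplified RIM-style cell (illustration only)."""

    def __init__(self, n_modules=4, k_active=2, d_in=3, d_hid=5, d_key=4, seed=0):
        rng = np.random.default_rng(seed)
        self.k, self.d_key = k_active, d_key
        n = n_modules
        init = lambda *shape: rng.normal(0.0, 0.3, shape)
        # Leading axis n: every module owns its own parameters.
        self.Wq_in = init(n, d_hid, d_key)   # queries for input attention
        self.Wk_in = init(d_in, d_key)       # keys from the input
        self.Wv_in = init(d_in, d_hid)       # values from the input
        self.W_h = init(n, d_hid, d_hid)     # recurrent weights (assumed tanh RNN)
        self.W_x = init(n, d_hid, d_hid)     # weights on the attended input
        self.Wq_c = init(n, d_hid, d_key)    # communication queries
        self.Wk_c = init(n, d_hid, d_key)    # communication keys
        self.Wv_c = init(n, d_hid, d_hid)    # communication values

    def step(self, h, x):
        """h: (n_modules, d_hid) hidden states, x: (d_in,) input vector."""
        n = h.shape[0]
        scale = np.sqrt(self.d_key)
        per_module = lambda a, W: np.einsum("nh,nhj->nj", a, W)

        # 1) Input attention: each module attends over [real input, null input].
        slots = np.stack([x, np.zeros_like(x)])            # (2, d_in)
        keys, vals = slots @ self.Wk_in, slots @ self.Wv_in
        att_in = softmax(per_module(h, self.Wq_in) @ keys.T / scale)  # (n, 2)

        # 2) Activate the k modules that put the most weight on the real input.
        mask = np.zeros(n, dtype=bool)
        mask[np.argsort(-att_in[:, 0])[: self.k]] = True

        # 3) Independent dynamics: no parameters are shared across modules,
        #    and inactive modules keep their previous state.
        read = att_in @ vals                                # (n, d_hid)
        h_new = np.tanh(per_module(h, self.W_h) + per_module(read, self.W_x))
        h_mid = np.where(mask[:, None], h_new, h)

        # 4) Communication: modules exchange information through attention;
        #    only active modules write the result back (residual update).
        q = per_module(h_mid, self.Wq_c)
        kc = per_module(h_mid, self.Wk_c)
        v = per_module(h_mid, self.Wv_c)
        msg = softmax(q @ kc.T / scale) @ v                 # (n, d_hid)
        return np.where(mask[:, None], h_mid + msg, h), mask
```

Calling `TinyRIMs().step(h, x)` with `h` of shape `(4, 5)` and `x` of shape `(3,)` returns the next hidden states together with a boolean mask of the active modules. Rows of `h` belonging to inactive modules pass through unchanged, which is the property that lets individual modules specialize.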