Learning Hierarchical Information Flow with Recurrent Neural Modules
This work aims to improve information flow in sequential models for machine learning. The contribution may be incremental, since it builds on existing neural network architectures.
The authors address flexible feature sharing over time in sequential data processing by proposing ThalNet, a model of recurrent neural modules that communicate through a routing center. ThalNet outperforms standard recurrent neural networks on several sequential benchmarks.
We propose ThalNet, a deep learning model inspired by neocortical communication via the thalamus. Our model consists of recurrent neural modules that send features through a routing center, endowing the modules with the flexibility to share features over multiple time steps. We show that our model learns to route information hierarchically, processing input data through a chain of modules. We observe common architectures, such as feedforward neural networks and skip connections, emerging as special cases of our architecture, while novel connectivity patterns are learned for the text8 compression task. Our model outperforms standard recurrent neural networks on several sequential benchmarks.
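The routing idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify how modules read from the center or what kind of recurrent update they use, so the linear reading weights, the bias-free tanh update, the layer sizes, and every function name below (`init_modules`, `step`, `run`) are assumptions chosen for illustration. The sketch also assumes that the first module receives the external input and that the last module's features serve as the output.

```python
import math
import random


def make_matrix(rng, rows, cols, scale=0.1):
    """Return a rows x cols matrix of small Gaussian weights."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def vec_mat(vec, mat):
    """Multiply a vector of length rows by a rows x cols matrix."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def init_modules(rng, num_modules=4, feat_size=8, context_size=6, input_size=3):
    """Create hypothetical per-module reading and update weights."""
    center_size = num_modules * feat_size
    modules = []
    for i in range(num_modules):
        # Assumption: only the first module sees the external input.
        in_size = context_size + feat_size + (input_size if i == 0 else 0)
        modules.append({
            "read": make_matrix(rng, center_size, context_size),
            "update": make_matrix(rng, in_size, feat_size),
        })
    return modules


def step(modules, center, x, feat_size):
    """One time step: every module reads from the center and writes new features."""
    new_center = []
    for i, m in enumerate(modules):
        context = vec_mat(center, m["read"])             # learned read from the shared center
        prev = center[i * feat_size:(i + 1) * feat_size]  # this module's own previous features
        inputs = context + prev + (list(x) if i == 0 else [])
        new_center.extend(math.tanh(v) for v in vec_mat(inputs, m["update"]))
    # The new center is the concatenation of all module features.
    return new_center


def run(modules, sequence, feat_size):
    """Unroll over a sequence, returning the last module's features per step."""
    center = [0.0] * (len(modules) * feat_size)
    outputs = []
    for x in sequence:
        center = step(modules, center, x, feat_size)
        outputs.append(center[-feat_size:])
    return outputs


rng = random.Random(0)
modules = init_modules(rng)
outputs = run(modules, [[1.0, 0.5, -0.5]] * 3, feat_size=8)
```

In this sketch the center starts at zero and the updates have no bias, so the last module's features are all zero at the first step and become nonzero only once the first module's features have been written to the center and read back. This is one simple way to see how features can be shared across multiple time steps. Which modules end up reading from which others is determined by the reading weights, which would be learned in a trained model.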