Minimum Description Length Recurrent Neural Networks
This work addresses the challenge of creating transparent and generalizable neural networks for complex memory tasks, offering a novel approach to connectionist models.
The authors tackled the problem of training neural networks to balance complexity and accuracy using Minimum Description Length, achieving 100% accuracy on memory-intensive tasks and formal languages like a^nb^n and addition, with formal proofs of generalization.
We train neural networks to optimize a Minimum Description Length score, i.e., to balance between the complexity of the network and its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as $a^nb^n$, $a^nb^nc^n$, $a^nb^{2n}$, $a^nb^mc^{n+m}$, and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.