LG ML Dec 6, 2018

Layer Flexible Adaptive Computational Time

arXiv:1812.02335v51 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of adapting network structure for sequence modeling. The advance is incremental, as it builds on existing adaptive computation time methods.

The authors tackled the problem of determining the optimal number of layers in deep recurrent neural networks for sequence data by proposing a layer flexible model with adaptive computation time, achieving performance improvements of 7% to 12% on financial and Wikipedia language modeling datasets.

Deep recurrent neural networks perform well on sequence data and are the model of choice. However, it is a daunting task to decide the structure of the networks, i.e. the number of layers, especially considering the different computational needs of a sequence. We propose a layer flexible recurrent neural network with adaptive computation time, and expand it to a sequence to sequence model. Unlike the adaptive computation time model, our model has a dynamic number of transmission states which vary by step and sequence. We evaluate the model on a financial data set and Wikipedia language modeling. Experimental results show a performance improvement of 7% to 12% and indicate the model's ability to dynamically change the number of layers along with the computational steps.
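For readers unfamiliar with the baseline the abstract contrasts against, the following is a minimal sketch of the standard adaptive computation time halting rule: per-step halting probabilities are accumulated until they reach 1 - eps, the final step receives the remainder, and the output is the weighted mean of the intermediate states. It does not implement the layer flexible model proposed in the paper. The function names `act_halting_weights` and `act_output`, the `eps` default, and the toy inputs are assumptions made for illustration only.

```python
def act_halting_weights(halting_probs, eps=0.01):
    """Return per-step mixing weights under the ACT halting rule.

    halting_probs: per-step halting probabilities in (0, 1), e.g. the
    outputs of a sigmoid unit (hypothetical inputs here).
    Computation stops at the first step where the running sum reaches
    1 - eps, or at the last available step; that step gets the remainder
    so that the weights sum to 1.
    """
    weights = []
    total = 0.0
    for i, h in enumerate(halting_probs):
        is_last = i == len(halting_probs) - 1
        if total + h >= 1.0 - eps or is_last:
            weights.append(1.0 - total)  # remainder assigned to the halting step
            break
        weights.append(h)
        total += h
    return weights


def act_output(states, weights):
    """Weighted mean of the intermediate states (each a list of floats).

    Only the first len(weights) states are used, i.e. those computed
    before halting.
    """
    dim = len(states[0])
    return [sum(w * s[d] for w, s in zip(weights, states)) for d in range(dim)]


# Toy example: four candidate steps, halting occurs at the third.
probs = [0.1, 0.3, 0.7, 0.9]
states = [[1.0, 0.0], [2.0, 0.0], [3.0, 1.0], [4.0, 1.0]]
weights = act_halting_weights(probs)
output = act_output(states, weights)
```

In this sketch the number of computational steps varies with the input through the halting probabilities. According to the abstract, the paper's model goes further by letting the number of transmission states vary by step and sequence.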

