ML · LG · Aug 29, 2016

Optimizing Recurrent Neural Networks Architectures under Time Constraints

arXiv:1608.07892v3 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of efficiently designing RNN architectures for practitioners, though it appears incremental as it builds on existing optimization techniques.

The authors tackled the problem of optimizing recurrent neural network (RNN) architectures under time constraints by proposing algorithms to adjust hidden sizes, resulting in models that are more accurate or faster than manually tuned state-of-the-art and random search methods.

The architecture of a recurrent neural network (RNN) is a key factor influencing its performance. We propose algorithms to optimize hidden sizes under a running-time constraint. We convert the discrete optimization into a subset selection problem. Through novel transformations, the objective function becomes submodular and the constraint becomes supermodular. A greedy algorithm with bounds is suggested to solve the transformed problem, and we show how the transformations influence the bounds. To speed up optimization, we propose surrogate functions that balance exploration and exploitation. Experiments show that our algorithms can find more accurate or faster models than manually tuned state-of-the-art models and random search. We also compare popular RNN architectures using our algorithms.
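
To make the "greedy algorithm on a submodular objective under a supermodular constraint" idea concrete, here is a minimal sketch of a generic cost-benefit greedy selection. It is not the authors' algorithm: the toy objective `accuracy_proxy` (a set-coverage function, which is submodular), the toy constraint `time_proxy` (additive base cost plus a quadratic term in the subset size, which is supermodular), the item names, and the budget are all hypothetical stand-ins chosen for illustration. Reading the items as candidate units of hidden size is likewise an assumption.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical toy data: each item "covers" some elements and has a base cost.
COVERS: Dict[str, Set[int]] = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1},
}
BASE_TIME: Dict[str, float] = {"a": 2.0, "b": 1.0, "c": 3.0, "d": 1.0}


def accuracy_proxy(subset: Set[str]) -> float:
    """Toy submodular objective: number of distinct elements covered."""
    covered: Set[int] = set()
    for item in subset:
        covered |= COVERS[item]
    return float(len(covered))


def time_proxy(subset: Set[str]) -> float:
    """Toy supermodular cost: additive base cost plus a convex size penalty."""
    return sum(BASE_TIME[i] for i in subset) + 0.5 * len(subset) ** 2


def greedy_select(
    items: Iterable[str],
    gain: Callable[[Set[str]], float],
    cost: Callable[[Set[str]], float],
    budget: float,
) -> Set[str]:
    """Repeatedly add the feasible item with the best marginal gain per unit
    of marginal cost; stop when no feasible item improves the objective."""
    chosen: Set[str] = set()
    remaining = set(items)
    while remaining:
        best, best_ratio = None, 0.0
        for x in sorted(remaining):  # sorted for deterministic tie-breaking
            candidate = chosen | {x}
            if cost(candidate) > budget:
                continue  # adding x would violate the time budget
            d_gain = gain(candidate) - gain(chosen)
            d_cost = cost(candidate) - cost(chosen)
            ratio = d_gain / d_cost if d_cost > 0 else float("inf")
            if d_gain > 0 and ratio > best_ratio:
                best, best_ratio = x, ratio
        if best is None:
            break
        chosen.add(best)
        remaining.discard(best)
    return chosen
```

With a budget of 8.0, `greedy_select(COVERS, accuracy_proxy, time_proxy, 8.0)` picks `b` first (best gain per unit cost), then `c`, and stops because any third item would exceed the budget. The ratio rule is one common heuristic for budgeted selection; the bounds mentioned in the abstract depend on the authors' specific transformations and are not reproduced here.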

