LG CL ML | Oct 23, 2019

A Unifying Framework of Bilinear LSTMs

arXiv:1910.10294v2
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in sequence modeling for NLP applications, offering a method to enhance expressivity efficiently, though it appears incremental as it builds on existing LSTM frameworks.

The paper tackles the problem of improving LSTM performance by incorporating nonlinear feature interactions without increasing parameter count, achieving superior results over linear LSTMs in language-based sequence tasks.

This paper presents a novel unifying framework of bilinear LSTMs that can represent and exploit the nonlinear interactions among the input features in sequence datasets, achieving better performance than a linear LSTM without increasing the number of learnable parameters. To realize this, our framework balances the expressivity of the linear and bilinear terms by trading off the hidden state vector size against the approximation quality of the weight matrix in the bilinear term, so that the performance of the bilinear LSTM can be optimized at the same parameter count. We empirically evaluate our bilinear LSTM on several language-based sequence learning tasks to demonstrate its general applicability.
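The trade-off described in the abstract can be illustrated with a rough sketch. This is not the paper's formulation: it assumes a generic bilinear pre-activation in which each hidden unit k adds a term z^T W_k z to the usual linear term, with W_k approximated by a sum of r rank-one factors. The function names, the per-unit factorization, and the parameter-counting formulas are all illustrative assumptions.

```python
import numpy as np

def linear_lstm_params(d, h):
    # Assumed count: four gates, each with a weight matrix over [h_prev; x] and a bias.
    return 4 * (h * (h + d) + h)

def bilinear_lstm_params(d, h, r):
    # Assumed count: per gate and per hidden unit, two rank-r factors over [h_prev; x]
    # are added on top of the linear parameters.
    return linear_lstm_params(d, h) + 4 * (2 * r * h * (h + d))

def largest_hidden_size(d, r, budget):
    # Largest hidden size whose (assumed) bilinear parameter count fits the budget.
    h = 0
    while bilinear_lstm_params(d, h + 1, r) <= budget:
        h += 1
    return h

def bilinear_preactivation(z, W, b, P, Q):
    # z: (n,) concatenated [h_prev; x]; W: (h, n); b: (h,); P, Q: (h, r, n).
    # The bilinear term for unit k is sum_r (P[k, r] . z) * (Q[k, r] . z),
    # which equals z^T W_k z for W_k = sum_r outer(P[k, r], Q[k, r]).
    linear = W @ z + b
    bilinear = np.sum((P @ z) * (Q @ z), axis=1)
    return linear + bilinear

# Usage: match the budget of a linear LSTM with input size 10 and hidden size 20.
budget = linear_lstm_params(d=10, h=20)
h_bilinear = largest_hidden_size(d=10, r=1, budget=budget)
```

Under these assumed counts, raising the rank r improves the approximation of the bilinear weight matrix but forces a smaller hidden size at the same budget, which is the kind of balance the abstract refers to.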

