CV · Aug 26, 2019

Non-local Recurrent Neural Memory for Supervised Sequence Modeling

arXiv:1908.09535v1 · 12 citations
AI Analysis

This work addresses a key bottleneck in sequence modeling, with applications such as video analysis and NLP, though it appears incremental because it builds on existing recurrent architectures.

The paper tackles the limitation of recurrent neural networks in modeling long-range temporal dependencies by proposing Non-local Recurrent Neural Memory (NRNM), which captures high-order interactions between nonadjacent time steps and shows improved performance on action recognition and sentiment analysis tasks.

Typical methods for supervised sequence modeling are built upon recurrent neural networks to capture temporal dependencies. One potential limitation of these methods is that they explicitly model information interactions only between adjacent time steps in a sequence, so the high-order interactions between nonadjacent time steps are not fully exploited. This greatly limits their capability to model long-range temporal dependencies, since first-order interactions cannot be maintained over the long term due to information dilution and vanishing gradients. To tackle this limitation, we propose the Non-local Recurrent Neural Memory (NRNM) for supervised sequence modeling, which performs non-local operations to learn full-order interactions within a sliding temporal block and models global interactions between blocks in a gated recurrent manner. Consequently, our model is able to capture long-range dependencies. In addition, our model can distill the latent high-level features contained in high-order interactions. We demonstrate the merits of our NRNM on two different tasks: action recognition and sentiment analysis.
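
The block-wise scheme described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the abstract gives no equations, so the scaled dot-product form of the non-local operation, the mean pooling of each block, the single sigmoid gate, and every name used here (`non_local_block`, `nrnm_sketch`, `Wq`, `Wk`, `Wv`, `Wg`, `bg`, `block_size`, `stride`) are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def non_local_block(states, Wq, Wk, Wv):
    """Let every time step in a block interact with every other step.

    Assumption: the non-local operation is modeled as scaled dot-product
    self-attention. states has shape (block_size, d).
    """
    q, k, v = states @ Wq, states @ Wk, states @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])   # pairwise interactions
    return softmax(scores, axis=-1) @ v       # (block_size, d)


def nrnm_sketch(hidden, block_size, stride, params):
    """Slide a block over the hidden states and update a memory with a gate.

    hidden: (T, d) hidden states from some recurrent backbone (assumed given).
    Returns the final memory vector of shape (d,).
    """
    d = hidden.shape[1]
    memory = np.zeros(d)
    for start in range(0, hidden.shape[0] - block_size + 1, stride):
        block = hidden[start:start + block_size]
        mixed = non_local_block(block, params["Wq"], params["Wk"], params["Wv"])
        summary = mixed.mean(axis=0)  # assumption: mean-pool the block
        # Assumption: one sigmoid gate blends the new summary with old memory.
        gate_in = np.concatenate([summary, memory])
        gate = sigmoid(gate_in @ params["Wg"] + params["bg"])
        memory = gate * summary + (1.0 - gate) * memory
    return memory


# Usage with random stand-in values (no trained weights are implied).
rng = np.random.default_rng(0)
d = 4
params = {
    "Wq": rng.normal(size=(d, d)),
    "Wk": rng.normal(size=(d, d)),
    "Wv": rng.normal(size=(d, d)),
    "Wg": rng.normal(size=(2 * d, d)),
    "bg": np.zeros(d),
}
hidden = rng.normal(size=(10, d))
memory = nrnm_sketch(hidden, block_size=4, stride=2, params=params)
```

Within a block, the attention matrix connects all pairs of time steps directly, which is one way to read "full-order interactions"; the gated update then carries information from block to block, so distant steps can influence the final memory without passing through every intermediate recurrent step.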

