CV | CL | Sep 26, 2017

Tensor Product Generation Networks for Deep NLP Modeling

arXiv:1709.09118v51105 citations
Originality: Incremental advance
AI Analysis

This work targets both interpretability and performance in deep NLP models, using image captioning as the test task. The advance is incremental because it builds on existing Tensor Product Representations.

The paper tackles the problem of designing deep networks for natural language processing by proposing Tensor Product Generation Networks (TPGN), which use Tensor Product Representations to encode symbol structures and enable interpretable internal representations. In an image-caption generation model on the COCO dataset, TPGN outperforms LSTM baselines, with internal representations showing considerable grammatical content.

We present a new approach to the design of deep networks for natural language processing (NLP), based on the general technique of Tensor Product Representations (TPRs) for encoding and processing symbol structures in distributed neural networks. A network architecture, the Tensor Product Generation Network (TPGN), is proposed which is capable in principle of carrying out TPR computation, but which uses unconstrained deep learning to design its internal representations. Instantiated in a model for image-caption generation, TPGN outperforms LSTM baselines when evaluated on the COCO dataset. The TPR-capable structure enables interpretation of internal representations and operations, which prove to contain considerable grammatical content. Our caption-generation model can be interpreted as generating sequences of grammatical categories and retrieving words by their categories from a plan encoded as a distributed representation.
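To make the TPR idea in the abstract concrete, here is a minimal NumPy sketch of generic tensor-product binding and unbinding. It is not the TPGN architecture, which learns its representations end to end. The function names `bind` and `unbind`, the dimensions, and the choice of orthonormal role vectors are all assumptions made for illustration. A structure is stored as a sum of outer products of filler and role vectors, and a filler is read back by multiplying that sum with the matching role, loosely analogous to retrieving a word from a distributed plan by its category.

```python
import numpy as np

def bind(fillers, roles):
    """Return the TPR of the pairs: the sum of outer products f_i r_i^T."""
    return sum(np.outer(f, r) for f, r in zip(fillers, roles))

def unbind(tpr, unbinding_vector):
    """Read out the filler bound to a role via a matrix-vector product."""
    return tpr @ unbinding_vector

rng = np.random.default_rng(0)
d_filler, n_roles = 8, 3  # illustrative sizes, not taken from the paper

# Random filler vectors (stand-ins for word embeddings), one per row.
fillers = rng.normal(size=(n_roles, d_filler))

# Assumption: orthonormal roles, so each role is its own unbinding vector.
q, _ = np.linalg.qr(rng.normal(size=(n_roles, n_roles)))
roles = q.T  # rows are orthonormal role vectors

tpr = bind(fillers, roles)           # shape (d_filler, n_roles)
recovered = unbind(tpr, roles[1])    # filler that was bound to role 1
```

With orthonormal roles the readout is exact up to floating-point error, so `recovered` matches `fillers[1]`. When roles are only linearly independent, the unbinding vectors would instead come from the dual basis of the roles.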

Code Implementations: 2 repos