CL · LG · Jan 31, 2019

A Generalized Language Model in Tensor Space

arXiv:1901.11167v1 · 19 citations
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in language modeling for NLP researchers, offering an incremental improvement through higher-order tensor representations.

The paper tackles the limited expressive power of low-order tensors in language modeling by proposing TSLM, a high-order tensor representation that generalizes n-gram models and shows effectiveness on PTB and WikiText benchmarks.

In the literature, tensors have been used effectively to capture context information in language models. However, existing methods usually adopt relatively low-order tensors, which have limited expressive power for modeling language. Developing a higher-order tensor representation is challenging, both in deriving an effective solution and in showing its generality. In this paper, we propose a language model named Tensor Space Language Model (TSLM), which uses tensor networks and tensor decomposition. In TSLM, we build a high-dimensional semantic space from the tensor product of word vectors. Theoretically, we prove that this tensor representation is a generalization of the n-gram language model. We further show that this high-order tensor representation can be decomposed into a recursive calculation of conditional probability for language modeling. Experimental results on the Penn Treebank (PTB) dataset and the WikiText benchmark demonstrate the effectiveness of TSLM.
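The link between tensor products of word vectors and n-gram models can be illustrated with a small sketch. It is not the paper's implementation. It assumes one-hot word vectors, under which the tensor product of two word vectors selects a single entry of an order-2 tensor of joint bigram probabilities. The vocabulary, the corpus, and every function name below are hypothetical and chosen only for illustration.

```python
from itertools import product

# Toy vocabulary and corpus (hypothetical, for illustration only).
vocab = ["the", "cat", "sat"]
corpus = ["the", "cat", "sat", "the", "cat", "the", "sat"]


def one_hot(word):
    """Return the one-hot vector of `word` over `vocab`."""
    return [1.0 if w == word else 0.0 for w in vocab]


def outer(u, v):
    """Tensor (outer) product of two vectors, giving an order-2 tensor."""
    return [[a * b for b in v] for a in u]


def inner(s, t):
    """Inner product of two order-2 tensors of the same shape."""
    rows, cols = len(s), len(s[0])
    return sum(s[i][j] * t[i][j] for i, j in product(range(rows), range(cols)))


# Order-2 tensor of joint bigram probabilities, estimated by counting.
bigrams = list(zip(corpus, corpus[1:]))
joint = [[0.0] * len(vocab) for _ in vocab]
for w1, w2 in bigrams:
    joint[vocab.index(w1)][vocab.index(w2)] += 1.0 / len(bigrams)


def bigram_prob(w1, w2):
    """p(w1, w2) as the inner product of a basis tensor with `joint`."""
    return inner(outer(one_hot(w1), one_hot(w2)), joint)


def conditional_prob(w2, w1):
    """p(w2 | w1), obtained by normalising over the second word."""
    marginal = sum(bigram_prob(w1, w) for w in vocab)
    return bigram_prob(w1, w2) / marginal


print(bigram_prob("the", "cat"))
print(conditional_prob("cat", "the"))
```

With one-hot vectors this reduces to a table lookup, which is the classical bigram model. Replacing the one-hot vectors with dense word vectors and raising the tensor order is the direction in which the abstract describes TSLM as generalizing n-gram models. The sketch does not cover the decomposition that keeps such a high-order tensor tractable.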

Code Implementations: 1 repo
