ML · CL · Mar 23, 2017

Dynamic Bernoulli Embeddings for Language Evolution

arXiv:1703.08052v1 · 35 citations
AI Analysis

This paper addresses the challenge of tracking language evolution, a problem of interest to researchers in computational linguistics and digital humanities. The contribution appears incremental, since it builds directly on existing exponential family embeddings.

The paper models how word meanings change over time by developing dynamic embeddings within a probabilistic framework. Applied to historical texts such as U.S. Senate speeches and academic abstracts, the dynamic embeddings fit the data better than classical embeddings.

Word embeddings are a powerful approach for unsupervised analysis of language. Recently, Rudolph et al. (2016) developed exponential family embeddings, which cast word embeddings in a probabilistic framework. Here, we develop dynamic embeddings, building on exponential family embeddings to capture how the meanings of words change over time. We use dynamic embeddings to analyze three large collections of historical texts: the U.S. Senate speeches from 1858 to 2009, the history of computer science ACM abstracts from 1951 to 2014, and machine learning papers on the Arxiv from 2007 to 2015. We find dynamic embeddings provide better fits than classical embeddings and capture interesting patterns about how language changes.
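The abstract does not spell out the model, so the following is only a minimal sketch of one way a dynamic Bernoulli embedding objective could look. It rests on assumptions not stated above: each word has a separate embedding vector per time slice, context vectors are shared across time, a Gaussian random walk ties consecutive slices together, and negative sampling stands in for the zero observations. All names (`rho`, `alpha`, `lam`, `dynamic_bernoulli_log_joint`) are hypothetical, and priors on `alpha` and on the first time slice are omitted for brevity.

```python
import numpy as np


def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)) = -log(1 + exp(-z)).
    return -np.logaddexp(0.0, -z)


def dynamic_bernoulli_log_joint(rho, alpha, tokens, contexts, slices, negatives, lam):
    """Unnormalized log joint of a toy dynamic Bernoulli embedding (assumed form).

    rho       : (T, V, K) embedding vectors, one set per time slice
    alpha     : (V, K) context vectors, shared across time (assumption)
    tokens    : (N,) word id observed at each position
    contexts  : (N, C) word ids in each position's context window
    slices    : (N,) time-slice index of each position
    negatives : (N, S) sampled word ids treated as zeros (negative sampling)
    lam       : precision of the Gaussian random walk over time (assumption)
    """
    log_lik = 0.0
    for i in range(len(tokens)):
        t = slices[i]
        # Sum of the context vectors of the surrounding words.
        ctx = alpha[contexts[i]].sum(axis=0)
        # Observed word: Bernoulli "on" with probability sigmoid(rho . ctx).
        log_lik += log_sigmoid(rho[t, tokens[i]] @ ctx)
        # Negative samples: Bernoulli "off", log(1 - sigmoid(z)) = log_sigmoid(-z).
        log_lik += log_sigmoid(-(rho[t, negatives[i]] @ ctx)).sum()

    # Random-walk term: penalize drift between consecutive time slices.
    drift = rho[1:] - rho[:-1]
    log_prior = -0.5 * lam * np.sum(drift ** 2)
    return log_lik + log_prior
```

In this sketch a larger `lam` forces a word's embeddings to change more smoothly over time, while `lam = 0` decouples the slices entirely. In practice such an objective would be maximized with stochastic gradients rather than evaluated in a Python loop.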

Code Implementations: 1 repo