Temporal Analysis of Language through Neural Language Models
This work gives linguists and historians a method for automatically tracking semantic shifts in language over time. The contribution is incremental: it applies existing neural language models to a new temporal analysis task.
The authors detect language change over time by training a neural language model on the Google Books Ngram corpus from 1900 to 2009. The model identifies words such as 'cell' and 'gay' as having changed significantly and pinpoints the specific years in which the change occurred.
We provide a method for automatically detecting change in language across time through a chronologically trained neural language model. We train the model on the Google Books Ngram corpus to obtain word vector representations specific to each year, and identify words that have changed significantly from 1900 to 2009. The model identifies words such as "cell" and "gay" as having changed during that time period. The model simultaneously identifies the specific years during which such words underwent change.
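As an illustration of how year-specific word vectors could be used to flag a changed word and date the change, the following is a minimal sketch and not the authors' implementation. It assumes that change is measured by cosine distance, which the abstract does not state, and the function names (`total_shift`, `year_of_largest_change`) and the toy two-dimensional vectors are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def total_shift(vectors_by_year: Dict[int, Vector]) -> float:
    """Cosine distance between a word's earliest and latest vectors.

    Assumption: a large distance marks the word as having changed.
    """
    years = sorted(vectors_by_year)
    first, last = vectors_by_year[years[0]], vectors_by_year[years[-1]]
    return 1.0 - cosine_similarity(first, last)


def year_of_largest_change(vectors_by_year: Dict[int, Vector]) -> Tuple[int, float]:
    """Return the year whose vector differs most from the previous year's,
    together with that cosine distance."""
    years = sorted(vectors_by_year)
    if len(years) < 2:
        raise ValueError("need vectors for at least two years")
    best_year, best_dist = years[1], -1.0
    for prev, cur in zip(years, years[1:]):
        dist = 1.0 - cosine_similarity(vectors_by_year[prev], vectors_by_year[cur])
        if dist > best_dist:
            best_year, best_dist = cur, dist
    return best_year, best_dist


# Made-up vectors for a single hypothetical word; real vectors would come
# from a model trained on each year's portion of the corpus.
toy_vectors = {
    1900: [1.0, 0.0],
    1901: [1.0, 0.0],
    1902: [0.0, 1.0],
    1903: [0.0, 1.0],
}

shift = total_shift(toy_vectors)
change_year, change_size = year_of_largest_change(toy_vectors)
```

In this toy input the vector rotates between 1901 and 1902, so the sketch reports 1902 as the year of largest change. Ranking a whole vocabulary by `total_shift` would be one way to surface candidate words, again under the cosine-distance assumption.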