A Neural Model for User Geolocation and Lexical Dialectology
This work addresses text-based geolocation and dialect analysis for social media users. It is an incremental improvement built on a simple neural approach.
The authors tackled text-based user geolocation and dialect detection by proposing a neural network model that achieves state-of-the-art performance on three Twitter benchmark datasets. The hidden layer embeddings are used to detect dialectal terms, and the authors release a new dataset, DAREDS, for evaluating dialect term detection.
We propose a simple yet effective text-based user geolocation model based on a neural network with one hidden layer, which achieves state-of-the-art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting dialectal terms. As part of our analysis of dialectal terms, we release DAREDS, a dataset for evaluating dialect term detection methods.
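
To make the architecture concrete, the following is a minimal sketch of a one-hidden-layer classifier of the kind the abstract describes. It is not the authors' implementation. Everything beyond "one hidden layer" is an assumption made for illustration: the bag-of-words input, the tanh activation, the treatment of locations as a fixed set of discrete regions, the random untrained weights, the toy dimensions, and the hypothetical names `OneHiddenLayerGeolocator`, `predict_proba`, and `term_embedding`. The sketch also assumes that a term's embedding can be read off by feeding a one-hot vector for that term through the hidden layer.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class OneHiddenLayerGeolocator:
    """Hypothetical sketch: bag-of-words -> hidden layer -> region probabilities."""

    def __init__(self, vocab_size, hidden_size, num_regions, seed=0):
        rng = np.random.default_rng(seed)
        # Weights are random here; a real model would be trained on labelled users.
        self.W_in = rng.normal(0.0, 0.1, size=(vocab_size, hidden_size))
        self.b_in = np.zeros(hidden_size)
        self.W_out = rng.normal(0.0, 0.1, size=(hidden_size, num_regions))
        self.b_out = np.zeros(num_regions)

    def hidden(self, X):
        # Assumed activation: tanh. The abstract does not specify one.
        return np.tanh(X @ self.W_in + self.b_in)

    def predict_proba(self, X):
        # One probability distribution over regions per input row (user).
        return softmax(self.hidden(X) @ self.W_out + self.b_out)

    def term_embedding(self, term_index):
        # Assumed procedure: feed a one-hot vector for a single term and
        # take the hidden-layer activation as that term's embedding.
        one_hot = np.zeros(self.W_in.shape[0])
        one_hot[term_index] = 1.0
        return self.hidden(one_hot)


# Toy usage: 2 users, a 6-term vocabulary, 3 candidate regions.
model = OneHiddenLayerGeolocator(vocab_size=6, hidden_size=4, num_regions=3)
X = np.array([[1, 0, 2, 0, 0, 1],
              [0, 3, 0, 1, 0, 0]], dtype=float)
probs = model.predict_proba(X)          # shape (2, 3), each row sums to 1
predicted_region = probs.argmax(axis=1)  # most probable region per user
embedding = model.term_embedding(2)      # hidden-layer vector for term 2
```

In a trained model of this shape, terms whose hidden-layer vectors lie close together would tend to be associated with similar regions, which is the intuition behind using the hidden layer to surface dialectal terms.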