CL · Jul 10, 2020

What Can We Learn From Almost a Decade of Food Tweets

arXiv:2007.05194v20.713 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work provides a resource for researchers in natural language processing focusing on food-related social media analysis, but it is incremental as it applies existing methods to a new dataset.

The authors tackled the problem of analyzing food-related social media data by creating the Latvian Twitter Eater Corpus, a dataset of over 2 million tweets collected over 8 years, and demonstrated its utility by training domain-specific question-answering and sentiment-analysis models.

We present the Latvian Twitter Eater Corpus, a set of tweets in the narrow domain of food, drinks, eating and drinking. The corpus has been collected over a time span of more than 8 years and includes over 2 million tweets accompanied by additional useful data. We also separate out two sub-corpora: question-and-answer tweets and sentiment-annotated tweets. We analyse the contents of the corpus and demonstrate use cases for the sub-corpora by training domain-specific question-answering and sentiment-analysis models on data from the corpus.
