CL | AI | Dec 9, 2024

Annotations for Exploring Food Tweets From Multiple Aspects

arXiv:2412.06179v182 citations | h-index: 18 | LREC
Originality Synthesis-oriented
AI Analysis

This work provides annotated data for analyzing food-related tweets, but it is incremental as it builds on an existing corpus.

The researchers expanded the Latvian Twitter Eater Corpus (LTEC) by adding manually annotated subsets for tasks like machine translation and sentiment analysis, and they tested baseline models on these datasets to identify future challenges.

This research builds upon the Latvian Twitter Eater Corpus (LTEC), which focuses on the narrow domain of tweets related to food, drinks, eating, and drinking. LTEC has been collected for more than 12 years and has reached almost 3 million tweets, each with basic information as well as extended metadata annotated both automatically and manually. In this paper, we supplement the LTEC with manually annotated subsets of evaluation data for machine translation, named entity recognition, timeline-balanced sentiment analysis, and text-image relation classification. We experiment with each of the data sets using baseline models and highlight future challenges for various modelling approaches.

Code Implementations: 1 repo
