CL, Jul 20, 2021

TLA: Twitter Linguistic Analysis

arXiv:2107.09710v1

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of inefficient data processing for researchers studying linguistic communities on social media, though the contribution appears incremental, as it mainly structures existing methods.

The paper tackles the cumbersome process of collecting, labeling, and analyzing Twitter data for linguistic studies by introducing TLA (Twitter Linguistic Analysis), a framework that provides detailed labeled datasets for multiple languages and trains models on them.

Linguistics has been instrumental in developing a deeper understanding of human nature. Words are indispensable for conveying the thoughts, emotions, and purpose of any human interaction, and critically analyzing them can elucidate the social and psychological behavior and characteristics of humans as social animals. Social media has become a platform for human interaction on a large scale and thus offers scope for collecting and using that data for study. However, iteratively collecting, labeling, and analyzing this data makes the procedure cumbersome. To make this process easier and more structured, we introduce TLA (Twitter Linguistic Analysis). In this paper, we describe TLA, provide a basic understanding of the framework, and discuss the process of collecting, labeling, and analyzing data from Twitter for a corpus of languages. We also provide detailed labeled datasets for all the languages, along with models trained on these datasets. The analysis provided by TLA will also go a long way toward understanding the sentiments of different linguistic communities and toward developing new and innovative solutions to their problems.
