Assessing Capabilities of Large Language Models in Social Media Analytics: A Multi-task Quest

Ramtin Davoudi, Kartik Thakkar, Nazanin Donyapour, Tyler Derr, Hamid Karimi

arXiv:2604.18955 · 85.2 · h-index: 4

Predicted impact: top 48% in CL · last 90 days · Originality: Incremental advance

AI Analysis

For social media researchers and practitioners, this work offers the first unified benchmark for LLM performance on core analytics tasks, but the evaluation is limited to a single Twitter dataset and may not generalize.

This study provides the first comprehensive evaluation of modern LLMs (GPT-4, GPT-4o, etc.) across three social media analytics tasks: authorship verification, post generation, and user attribute inference. Results establish reproducible benchmarks and show that LLMs can generate authentic user-like content, with GPT-4 achieving the best performance in most tasks.

In this study, we present the first comprehensive evaluation of modern LLMs (including GPT-4, GPT-4o, GPT-3.5-Turbo, Gemini 1.5 Pro, DeepSeek-V3, Llama 3.2, and BERT) across three core social media analytics tasks on a Twitter (X) dataset: (I) Social Media Authorship Verification, (II) Social Media Post Generation, and (III) User Attribute Inference. For authorship verification, we introduce a systematic sampling framework over diverse user and post selection strategies and evaluate generalization on newly collected tweets from January 2024 onward to mitigate "seen-data" bias. For post generation, we assess the ability of LLMs to produce authentic, user-like content using comprehensive evaluation metrics. Bridging Tasks I and II, we conduct a user study to measure real users' perceptions of LLM-generated posts conditioned on their own writing. For attribute inference, we annotate occupations and interests using two standardized taxonomies (IAB Tech Lab 2023 and 2018 U.S. SOC) and benchmark LLMs against existing baselines. Overall, our unified evaluation provides new insights and establishes reproducible benchmarks for LLM-driven social media analytics. The code and data are provided in the supplementary material and will also be made publicly available upon publication.
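The temporal hold-out mentioned in the abstract (evaluating on tweets from January 2024 onward) can be illustrated with a minimal sketch. This is not the authors' code: the record fields (`user`, `text`, `created_at`), the function name `split_by_cutoff`, and the exact cutoff of 1 January 2024 UTC are all assumptions made for illustration.

```python
from datetime import datetime, timezone

# Assumed cutoff: the abstract says "January 2024 onward"; the exact day and
# timezone are not given, so 1 January 2024 UTC is an illustrative choice.
CUTOFF = datetime(2024, 1, 1, tzinfo=timezone.utc)


def split_by_cutoff(posts, cutoff=CUTOFF):
    """Split posts into (older, newer) lists by their timestamp.

    Posts at or after the cutoff go into `newer`, the kind of recent data a
    model is less likely to have seen during pretraining.
    """
    older = [p for p in posts if p["created_at"] < cutoff]
    newer = [p for p in posts if p["created_at"] >= cutoff]
    return older, newer


# Hypothetical records; field names are placeholders, not the paper's schema.
posts = [
    {"user": "u1", "text": "older post",
     "created_at": datetime(2023, 6, 1, tzinfo=timezone.utc)},
    {"user": "u1", "text": "recent post",
     "created_at": datetime(2024, 2, 15, tzinfo=timezone.utc)},
    {"user": "u2", "text": "boundary post",
     "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]

older, newer = split_by_cutoff(posts)
```

Under these assumptions, `newer` would serve as the evaluation pool for the generalization check, while `older` would supply each user's reference posts.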
