LG · AI · Nov 20, 2024

Evaluating LLMs Capabilities Towards Understanding Social Dynamics

arXiv:2411.13008v1 · 1 citation · h-index: 9 · ASONAM
Originality: Synthesis-oriented
AI Analysis

The paper addresses the problem of assessing LLMs for social media applications such as cyberbullying detection, which matters to developers and researchers. Its contribution is incremental, since it builds on existing evaluation methods.

This paper evaluates large language models (LLMs) on their ability to understand social dynamics in social media, focusing on language, directionality, and bullying/anti-bullying detection, finding mixed results with fine-tuned models showing promise in some tasks but not others.

Social media discourse involves people from different backgrounds, beliefs, and motives. Thus, such discourse can often devolve into toxic interactions. Generative models, such as Llama and ChatGPT, have recently exploded in popularity due to their capabilities in zero-shot question-answering. Because these models are increasingly being used to ask questions of social significance, a crucial research question is whether they can understand social media dynamics. This work provides a critical analysis of generative LLMs' ability to understand language and dynamics in social contexts, particularly considering cyberbullying and anti-cyberbullying (posts aimed at reducing cyberbullying) interactions. Specifically, we compare and contrast the capabilities of different large language models (LLMs) to understand three key aspects of social dynamics: language, directionality, and the occurrence of bullying/anti-bullying messages. We found that while fine-tuned LLMs exhibit promising results in some social media understanding tasks (understanding directionality), they presented mixed results in others (proper paraphrasing and bullying/anti-bullying detection). We also found that fine-tuning and prompt engineering mechanisms can have positive effects in some tasks. We believe that an understanding of LLMs' capabilities is crucial to designing future models that can be effectively used in social applications.
