CLJan 22
ToxiTwitch: Toward Emote-Aware Hybrid Moderation for Live Streaming PlatformsBaktash Ansari, Elias Martin, Afra Mashhadi
The rapid growth of live-streaming platforms such as Twitch has introduced complex challenges in moderating toxic behavior. Traditional moderation approaches, such as human annotation and keyword-based filtering, have demonstrated utility, but human moderators on Twitch constantly struggle to scale effectively in the fast-paced, high-volume, and context-rich chat environment of the platform while also facing harassment themselves. Recent advances in large language models (LLMs), such as DeepSeek-R1-Distill and Llama-3-8B-Instruct, offer new opportunities for toxicity detection, especially in understanding nuanced, multimodal communication involving emotes. In this work, we present an exploratory comparison of toxicity detection approaches tailored to Twitch. Our analysis reveals that incorporating emotes improves the detection of toxic behavior. To this end, we introduce ToxiTwitch, a hybrid model that combines LLM-generated embeddings of text and emotes with traditional machine learning classifiers, including Random Forest and SVM. In our case study, the proposed hybrid approach reaches up to 80 percent accuracy under channel-specific training (with 13 percent improvement over BERT and F1-score of 76 percent). This work is an exploratory study intended to surface challenges and limits of emote-aware toxicity detection on Twitch.
AIFeb 14, 2024
AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe ApproachMaryam Amirizaniani, Elias Martin, Tanya Roosta et al.
As Large Language Models (LLMs) are integrated into various sectors, ensuring their reliability and safety is crucial. This necessitates rigorous probing and auditing to maintain their effectiveness and trustworthiness in practical applications. Subjecting LLMs to varied iterations of a single query can unveil potential inconsistencies in their knowledge base or functional capacity. However, a tool for performing such audits with a easy to execute workflow, and low technical threshold is lacking. In this demo, we introduce ``AuditLLM,'' a novel tool designed to audit the performance of various LLMs in a methodical way. AuditLLM's primary function is to audit a given LLM by deploying multiple probes derived from a single question, thus detecting any inconsistencies in the model's comprehension or performance. A robust, reliable, and consistent LLM is expected to generate semantically similar responses to variably phrased versions of the same question. Building on this premise, AuditLLM generates easily interpretable results that reflect the LLM's consistency based on a single input question provided by the user. A certain level of inconsistency has been shown to be an indicator of potential bias, hallucinations, and other issues. One could then use the output of AuditLLM to further investigate issues with the aforementioned LLM. To facilitate demonstration and practical uses, AuditLLM offers two key modes: (1) Live mode which allows instant auditing of LLMs by analyzing responses to real-time queries; and (2) Batch mode which facilitates comprehensive LLM auditing by processing multiple queries at once for in-depth analysis. This tool is beneficial for both researchers and general users, as it enhances our understanding of LLMs' capabilities in generating responses, using a standardized auditing platform.
CLJun 9, 2024
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended ResponsesMaryam Amirizaniani, Elias Martin, Maryna Sivachenko et al.
Theory of Mind (ToM) reasoning entails recognizing that other individuals possess their own intentions, emotions, and thoughts, which is vital for guiding one's own thought processes. Although large language models (LLMs) excel in tasks such as summarization, question answering, and translation, they still face challenges with ToM reasoning, especially in open-ended questions. Despite advancements, the extent to which LLMs truly understand ToM reasoning and how closely it aligns with human ToM reasoning remains inadequately explored in open-ended scenarios. Motivated by this gap, we assess the abilities of LLMs to perceive and integrate human intentions and emotions into their ToM reasoning processes within open-ended questions. Our study utilizes posts from Reddit's ChangeMyView platform, which demands nuanced social reasoning to craft persuasive responses. Our analysis, comparing semantic similarity and lexical overlap metrics between responses generated by humans and LLMs, reveals clear disparities in ToM reasoning capabilities in open-ended questions, with even the most advanced models showing notable limitations. To enhance LLM capabilities, we implement a prompt tuning method that incorporates human intentions and emotions, resulting in improvements in ToM reasoning performance. However, despite these improvements, the enhancement still falls short of fully achieving human-like reasoning. This research highlights the deficiencies in LLMs' social reasoning and demonstrates how integrating human intentions and emotions can boost their effectiveness.