CL, Jun 25, 2024
Unmasking the Imposters: How Censorship and Domain Adaptation Affect the Detection of Machine-Generated Tweets
Bryan E. Tuck, Rakesh M. Verma
The rapid development of large language models (LLMs) has significantly improved the generation of fluent and convincing text, raising concerns about their potential misuse on social media platforms. We present a comprehensive methodology for creating nine Twitter datasets to examine the generative capabilities of four prominent LLMs: Llama 3, Mistral, Qwen2, and GPT4o. These datasets encompass four censored and five uncensored model configurations, including 7B and 8B parameter base-instruction models of the three open-source LLMs. Additionally, we perform a data quality analysis to assess the characteristics of textual outputs from human, "censored," and "uncensored" models, employing semantic meaning, lexical richness, structural patterns, content characteristics, and detector performance metrics to identify differences and similarities. Our evaluation demonstrates that "uncensored" models significantly undermine the effectiveness of automated detection methods. This study addresses a critical gap by exploring smaller open-source models and the ramifications of "uncensoring," providing valuable insights into how domain adaptation and content moderation strategies influence both the detectability and structural characteristics of machine-generated text.
CL, Nov 26, 2025
Orthographic Constraint Satisfaction and Human Difficulty Alignment in Large Language Models
Bryan E. Tuck, Rakesh M. Verma
Large language models must satisfy hard orthographic constraints during controlled text generation, yet systematic cross-architecture evaluation remains limited. We evaluate 28 configurations spanning three model families (Qwen3, Claude Haiku-4.5, GPT-5-mini) on 58 word puzzles requiring character-level constraint satisfaction. Architectural differences produce substantially larger performance gaps (2.0-2.2x, F1=0.761 vs. 0.343) than parameter scaling within families (83% gain from eightfold scaling), suggesting that constraint satisfaction may require specialized architectural features or training objectives beyond standard language model scaling. Thinking budget sensitivity proves heterogeneous: high-capacity models show strong returns (+0.102 to +0.136 F1), while mid-sized variants saturate or degrade. These patterns are inconsistent with uniform compute benefits. Using difficulty ratings from 10,000 human solvers per puzzle, we establish modest but consistent calibration (r=0.24-0.38) across all families, yet identify systematic failures on common words with unusual orthography ("data", "poop", "loll": 86-95% human success, 89-96% model miss rate). These failures reveal over-reliance on distributional plausibility that penalizes orthographically atypical but constraint-valid patterns, suggesting architectural innovations may be required beyond simply scaling parameters or computational budgets.
CL, Feb 1, 2024
Domain-Independent Deception: A New Taxonomy and Linguistic Analysis
Rakesh M. Verma, Nachum Dershowitz, Victor Zeng et al.
Internet-based economies and societies are drowning in deceptive attacks. These attacks take many forms, such as fake news, phishing, and job scams, which we call "domains of deception." Machine-learning and natural-language-processing researchers have been attempting to ameliorate this precarious situation by designing domain-specific detectors. Only a few recent works have considered domain-independent deception. We collect these disparate threads of research and investigate domain-independent deception. First, we provide a new computational definition of deception and break down deception into a new taxonomy. Then, we analyze the debate on linguistic cues for deception and supply guidelines for systematic reviews. Finally, we investigate common linguistic features and give evidence for knowledge transfer across different forms of deception.
CL, May 7, 2024
A Roadmap for Multilingual, Multimodal Domain Independent Deception Detection
Dainis Boumber, Rakesh M. Verma, Fatima Zahra Qachfar
Deception, a prevalent aspect of human communication, has undergone a significant transformation in the digital age. With the globalization of online interactions, individuals are communicating in multiple languages and mixing languages on social media, with varied data becoming available in each language and dialect. At the same time, the techniques for detecting deception are similar across the board. Recent studies have shown the possibility of the existence of universal linguistic cues to deception across domains within the English language; however, the existence of such cues in other languages remains unknown. Furthermore, the practical task of deception detection in low-resource languages is not a well-studied problem due to the lack of labeled data. Another dimension of deception is multimodality. For example, a picture with an altered caption in fake news or disinformation may exist. This paper calls for a comprehensive investigation into the complexities of deceptive language across linguistic boundaries and modalities within the realm of computer security and natural language processing and the possibility of using multilingual transformer models and labeled data in various languages to universally address the task of deception detection.
LG, Aug 6, 2025
Assessing Representation Stability for Transformer Models
Bryan E. Tuck, Rakesh M. Verma
Adversarial text attacks remain a persistent threat to transformer models, yet existing defenses are typically attack-specific or require costly model retraining. We introduce Representation Stability (RS), a model-agnostic detection framework that identifies adversarial examples by measuring how embedding representations change when important words are masked. RS first ranks words using importance heuristics, then measures embedding sensitivity to masking top-k critical words, and processes the resulting patterns with a BiLSTM detector. Experiments show that adversarially perturbed words exhibit disproportionately high masking sensitivity compared to naturally important words. Across three datasets, three attack types, and two victim models, RS achieves over 88% detection accuracy and demonstrates competitive performance compared to existing state-of-the-art methods, often at lower computational cost. Using Normalized Discounted Cumulative Gain (NDCG) to measure perturbation identification quality, we reveal that gradient-based ranking outperforms attention and random selection approaches, with identification quality correlating with detection performance for word-level attacks. RS also generalizes well to unseen datasets, attacks, and models without retraining, providing a practical solution for adversarial text detection.
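The masking step described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the hash-based `word_vector` stands in for a real transformer encoder, the importance scores are supplied by hand rather than computed by a gradient or attention heuristic, and the BiLSTM detector that would consume the resulting sequence is omitted. It is not the authors' implementation.

```python
import hashlib
import math

DIM = 16  # toy embedding width (assumption, not from the paper)


def word_vector(word):
    """Deterministic pseudo-embedding for a word (stand-in for a real encoder)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return [(b - 127.5) / 127.5 for b in digest[:DIM]]


def embed(tokens):
    """Mean-pool the word vectors into a single sentence vector."""
    vectors = [word_vector(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def masking_sensitivity(tokens, importance, k=3, mask="[MASK]"):
    """Mask each of the top-k most important words in turn and record
    how far the sentence embedding moves from the unmasked baseline."""
    baseline = embed(tokens)
    ranked = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)[:k]
    features = []
    for i in ranked:
        masked = tokens[:i] + [mask] + tokens[i + 1:]
        features.append((tokens[i], cosine_distance(baseline, embed(masked))))
    return features


tokens = "the movie was absolutely wonderfull tonight".split()
importance = [0.1, 0.4, 0.1, 0.6, 0.9, 0.2]  # hypothetical scores
for word, shift in masking_sensitivity(tokens, importance, k=2):
    print(f"{word}: {shift:.4f}")
```

In a full pipeline, the ordered sequence of shifts would be the input to the sequence detector. The sketch stops at producing that sequence.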
CL, Feb 28, 2025
Autoencoder-Based Framework to Capture Vocabulary Quality in NLP
Vu Minh Hoang Dang, Rakesh M. Verma
Linguistic richness is essential for advancing natural language processing (NLP), as dataset characteristics often directly influence model performance. However, traditional metrics such as Type-Token Ratio (TTR), Vocabulary Diversity (VOCD), and Measure of Lexical Text Diversity (MTLD) do not adequately capture contextual relationships, semantic richness, and structural complexity. In this paper, we introduce an autoencoder-based framework that uses neural network capacity as a proxy for vocabulary richness, diversity, and complexity, enabling a dynamic assessment of the interplay between vocabulary size, sentence structure, and contextual depth. We validate our approach on two distinct datasets: the DIFrauD dataset, which spans multiple domains of deceptive and fraudulent text, and the Project Gutenberg dataset, representing diverse languages, genres, and historical periods. Experimental results highlight the robustness and adaptability of our method, offering practical guidance for dataset curation and NLP model design. By enhancing traditional vocabulary evaluation, our work fosters the development of more context-aware, linguistically adaptive NLP systems.
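For context on the traditional metrics this abstract contrasts with, Type-Token Ratio is the number of distinct word types divided by the total number of tokens. The sketch below assumes a simple lowercase regex tokenizer, which is an illustrative choice and not the tokenization used in the paper.

```python
import re


def type_token_ratio(text):
    """Type-Token Ratio: distinct word types divided by total tokens."""
    # Assumed tokenizer: lowercase alphabetic runs, apostrophes allowed.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


print(type_token_ratio("the cat sat on the mat"))  # 5 types / 6 tokens
```

Because TTR looks only at surface counts, two texts with very different sentence structure or contextual depth can score identically, which is the kind of limitation the abstract points to.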
CL, Jun 28, 2024
The Pitfalls of Publishing in the Age of LLMs: Strange and Surprising Adventures with a High-Impact NLP Journal
Rakesh M. Verma, Nachum Dershowitz
We show the fraught side of the academic publishing realm and illustrate it through a recent case study with an NLP journal.
CL, Feb 5, 2024
Homograph Attacks on Maghreb Sentiment Analyzers
Fatima Zahra Qachfar, Rakesh M. Verma
We examine the impact of homograph attacks on the Sentiment Analysis (SA) task for different Arabic dialects from the Maghreb countries of North Africa. Homograph attacks result in a 65.3% decrease in transformer classification performance, from an F1-score of 0.95 to 0.33, when data is written in "Arabizi". The goal of this study is to highlight LLMs' weaknesses and to prioritize ethical and responsible Machine Learning.
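The reported 65.3% figure is the relative drop between the two F1-scores, which the short sketch below reproduces. It also includes a generic illustration of look-alike character substitution. The `LOOKALIKES` map (Latin letters swapped for visually similar Cyrillic ones) and the `substitute` helper are hypothetical examples of the general idea, not the attack procedure used in the paper.

```python
def relative_decrease(before, after):
    """Relative drop from `before` to `after`, as a fraction of `before`."""
    return (before - after) / before


# Hypothetical look-alike map: Latin letter -> visually similar Cyrillic letter.
LOOKALIKES = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
}


def substitute(text):
    """Replace each mapped Latin letter with its look-alike code point."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)


drop = relative_decrease(0.95, 0.33)
print(f"{drop * 100:.1f}% relative decrease in F1")

original = "good movie"
attacked = substitute(original)
print(original == attacked)            # False: the code points differ
print(len(original) == len(attacked))  # True: the length is unchanged
```

The substituted string looks nearly identical when rendered but is a different sequence of code points, so a model that tokenizes on raw characters sees unfamiliar input.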
LG, Mar 14, 2021
Claim Verification using a Multi-GAN based Model
Amartya Hatua, Arjun Mukherjee, Rakesh M. Verma
This article describes research on claim verification carried out using a multiple GAN-based model. The proposed model consists of three pairs of generators and discriminators. The generator and discriminator pairs are responsible for generating synthetic data for supported and refuted claims and claim labels. A theoretical discussion about the proposed model is provided to validate the equilibrium state of the model. The proposed model is applied to the FEVER dataset, and a pre-trained language model is used for the input text data. The synthetically generated data supplies additional information that helps the model outperform state-of-the-art models and other standard classifiers.
CL, Jul 14, 2020
Modeling Coherency in Generated Emails by Leveraging Deep Neural Learners
Avisha Das, Rakesh M. Verma
Advanced machine learning and natural language techniques enable attackers to launch sophisticated and targeted social engineering-based attacks. To counter the active attacker issue, researchers have since resorted to proactive methods of detection. Email masquerading using targeted emails to fool the victim is an advanced attack method. However, automatic text generation requires controlling the context and coherency of the generated content, which has been identified as an increasingly difficult problem. The method used leverages a hierarchical deep neural model which uses a learned representation of the sentences in the input document to generate structured written emails. We demonstrate the generation of short and targeted text messages using the deep model. The global coherency of the synthesized text is evaluated using a qualitative study as well as multiple quantitative measures.
CR, Jun 24, 2020
Less is More: Exploiting Social Trust to Increase the Effectiveness of a Deception Attack
Shahryar Baki, Rakesh M. Verma, Arjun Mukherjee et al.
Cyber attacks such as phishing, IRS scams, etc., are still successful in fooling Internet users. Users are the last line of defense against these attacks since attackers seem to always find a way to bypass security systems. Understanding how users reason about scams and frauds can help security providers improve users' security hygiene practices. In this work, we study the users' reasoning and the effectiveness of several variables within the context of the company representative fraud. Some of the variables that we study are: 1) the effect of using LinkedIn as a medium for delivering the phishing message instead of using email, 2) the effectiveness of natural language generation techniques in generating phishing emails, and 3) how some simple customizations, e.g., adding the sender's contact info to the email, affect participants' perception. The results obtained from the within-subject study show that participants are not prepared even for a well-known attack: company representative fraud. Findings include: approximately 65% mean detection rate and insights into how the success rate changes with the facade and correspondent (sender/receiver) information. A significant finding is that a smaller set of well-chosen strategies is better than a large "mess" of strategies. We also find significant differences in how males and females approach the same company representative fraud. Insights from our work could help defenders in developing better strategies to evaluate their defenses and in devising better training strategies.