Gabi Dreo Rodosek

CL
h-index3
8papers
16citations
Novelty46%
AI Score44

8 Papers

CLOct 25, 2023
How well can machine-generated texts be identified and can language models be trained to avoid identification?

Sinclair Schneider, Florian Steuber, Joao A. G. Schneider et al.

With the rise of generative pre-trained transformer models such as GPT-3, GPT-NeoX, or OPT, distinguishing human-generated texts from machine-generated ones has become important. We refined five separate language models to generate synthetic tweets, uncovering that shallow learning classification algorithms, like Naive Bayes, achieve detection accuracy between 0.6 and 0.8. Shallow learning classifiers differ from human-based detection, especially when using higher temperature values during text generation, resulting in a lower detection rate. Humans prioritize linguistic acceptability, which tends to be higher at lower temperature values. In contrast, transformer-based classifiers have an accuracy of 0.9 and above. We found that using a reinforcement learning approach to refine our generative models can successfully evade BERT-based classifiers with a detection accuracy of 0.15 or less.

1.6CLMay 14
Ideology Prediction of German Political Texts

Sinclair Schneider, Florian Steuber, Joao A. G. Schneider et al.

Elections represent a crucial milestone in a nation's ongoing development. To better understand the political rhetoric from various movements, ranging from left to right, we propose a transformer-based model capable of projecting the political orientation of a text on a continuous left-to-right spectrum, represented by a normalized scalar d between -1 and 1. This approach enables analysts to focus on specific segments of the political landscape, such as conservatives, while excluding liberal and far-right movements. Such a task can only be achieved with multiclass classifiers, provided that the desired orientation is incorporated within one of their predefined classes. To determine the most suitable foundation model among 13 candidate transformers for this task, we constructed four distinct corpora. One corpus comprised annotated plenary notes from the German Bundestag, while another was based on an official online decision-making tool, Wahl-O-Mat. The third corpus consisted of articles from 33 newspapers, each identified by its political orientation, and the fourth included 535,200 tweets from 597 members of the 20th and 21st German Bundestag. To mitigate overfitting, we used two distinct corpora for training and two for testing, respectively. For in-domain performance, DeBERTa-large achieved the highest F1 score F1=0.844 as well as for the X (Twitter) out-of-domain test ACC=0.864. Regarding the newspaper out-of-domain test, Gemma2-2B excelled (MAE = 0.172). This study demonstrates that transformer models can recognize political framing in German news at the level of public opinion polls. Our findings suggest that both the model architecture and the availability of domain-specific training data can be as influential as model size for estimating political bias. We discuss methodological limitations and outline directions for improving the robustness of bias measurement.

38.0CLMay 14
LLM-based Detection of Manipulative Political Narratives

Sinclair Schneider, Florian Steuber, Gabi Dreo Rodosek

We present a new computational framework for detecting and structuring manipulative political narratives. A task that became more important due to the shift of political discussions to social media. One of the primary challenges thereby is differentiating between manipulative political narratives and legitimate critiques. Some posts may also reframe actual events within a manipulative context. To achieve good clustering results, we filter manipulative posts beforehand using a detailed few-shot prompt that combines documented campaign narratives with legitimate criticisms to differentiate them. This prompt enables a reasoning model to assign labels, retaining only manipulative narrative posts for further processing. The remaining posts are subsequently embedded and dimensionality-reduced using UMAP, before HDBSCAN is applied to uncover narrative groups. A key advantage of this unsupervised approach is its independence from a predefined list of target categories, enabling it to uncover new narrative clusters. Finally, a reasoning model is employed to uncover the narrative behind each cluster. This approach, applied to over 1.2 million social media posts, effectively identified 41 distinct manipulative narrative clusters by integrating prompt-based filtering with unsupervised clustering.

7.9CRMay 8
GRASP -- Graph-Based Anomaly Detection Through Self-Supervised Classification

Robin Buchta, Carsten Kleiner, Felix Heine et al.

Advanced persistent threat (APT) attacks remain difficult to detect due to their stealth, adaptability, and use of legitimate system components. Provenance-based intrusion detection systems (PIDS) offer a promising defense by capturing detailed relationships between system components and actions. However, current PIDS rely on predefined or subset-determined thresholds, which limit detection stability and the ability to detect any anomalous behavior in general. Furthermore, related work often neglects the role of process executables, which describe system activity by interacting through a process with files, network components, and other processes. We introduce GRASP, a PIDS based on masked self-supervised classification. GRASP masks the executable information of processes and learns to infer it from their two-hop provenance graph neighborhood, marking misclassified processes as anomalies. It captures behavior patterns for the learned executables without thresholding, making it robust against interference and unknown activities. Evaluations on the DARPA TC and OpTC datasets demonstrate that GRASP consistently detects anomalous behavior, including known attack-related activities, outperforming existing systems. Our PIDS identifies all documented attacks on datasets where the behavior of executables is learnable. In addition, compared to existing systems, GRASP uncovers potentially malicious anomalous behavior not labeled as an attack in the documentation.

CLMar 10, 2025
Detection Avoidance Techniques for Large Language Models

Sinclair Schneider, Florian Steuber, Joao A. G. Schneider et al.

The increasing popularity of large language models has not only led to widespread use but has also brought various risks, including the potential for systematically spreading fake news. Consequently, the development of classification systems such as DetectGPT has become vital. These detectors are vulnerable to evasion techniques, as demonstrated in an experimental series: Systematic changes of the generative models' temperature proofed shallow learning-detectors to be the least reliable. Fine-tuning the generative model via reinforcement learning circumvented BERT-based-detectors. Finally, rephrasing led to a >90\% evasion of zero-shot-detectors like DetectGPT, although texts stayed highly similar to the original. A comparison with existing work highlights the better performance of the presented methods. Possible implications for society and further research are discussed.

CRFeb 12, 2021
Realizable Universal Adversarial Perturbations for Malware

Raphael Labaca-Castro, Luis Muñoz-González, Feargus Pendlebury et al.

Machine learning classifiers are vulnerable to adversarial examples -- input-specific perturbations that manipulate models' output. Universal Adversarial Perturbations (UAPs), which identify noisy patterns that generalize across the input space, allow the attacker to greatly scale up the generation of such examples. Although UAPs have been explored in application domains beyond computer vision, little is known about their properties and implications in the specific context of realizable attacks, such as malware, where attackers must satisfy challenging problem-space constraints. In this paper we explore the challenges and strengths of UAPs in the context of malware classification. We generate sequences of problem-space transformations that induce UAPs in the corresponding feature-space embedding and evaluate their effectiveness across different malware domains. Additionally, we propose adversarial training-based mitigations using knowledge derived from the problem-space transformations, and compare against alternative feature-space defenses. Our experiments limit the effectiveness of a white box Android evasion attack to ~20% at the cost of ~3% TPR at 1% FPR. We additionally show how our method can be adapted to more restrictive domains such as Windows malware. We observe that while adversarial training in the feature space must deal with large and often unconstrained regions, UAPs in the problem space identify specific vulnerabilities that allow us to harden a classifier more effectively, shifting the challenges and associated cost of identifying new universal adversarial transformations back to the attacker.

NIApr 20, 2020
Tracemax: A Novel Single Packet IP Traceback Strategy for Data-Flow Analysis

Peter Hillmann, Frank Tietze, Gabi Dreo Rodosek

The identification of the exact path that packets are routed on in the network is quite a challenge. This paper presents a novel, efficient traceback strategy named Tracemax in context of a defense system against distributed denial of service (DDoS) attacks. A single packet can be directly traced over many more hops than the current existing techniques allow. In combination with a defense system it differentiates between multiple connections. It aims to letting non-malicious connections pass while bad ones get thwarted. The novel concept allows detailed analyses of the traffic and the transmission path through the network. The strategy can effectively reduce the effect of common bandwidth and resource consumption attacks, foster early warning and prevention as well as higher the availability of the network services for the wanted customers.

MAApr 20, 2020
A Novel Multi-Agent System for Complex Scheduling Problems

Peter Hillmann, Tobias Uhlig, Gabi Dreo Rodosek et al.

Complex scheduling problems require a large amount computation power and innovative solution methods. The objective of this paper is the conception and implementation of a multi-agent system that is applicable in various problem domains. Independent specialized agents handle small tasks, to reach a superordinate target. Effective coordination is therefore required to achieve productive cooperation. Role models and distributed artificial intelligence are employed to tackle the resulting challenges. We simulate a NP-hard scheduling problem to demonstrate the validity of our approach. In addition to the general agent based framework we propose new simulation-based optimization heuristics to given scheduling problems. Two of the described optimization algorithms are implemented using agents. This paper highlights the advantages of the agent-based approach, like the reduction in layout complexity, improved control of complicated systems, and extendability.