AIJun 4
LLM Self-Recognition: Steering and Retrieving Activation SignaturesThibaud Ardoin, Jonas Schäfer, Gerhard Wunder
Recent advances in interpretability suggest that large language models (LLMs) implicitly encode signals in their generated text that enable self-recognition of their outputs. We demonstrate that this capability is reliable, even in low-entropy scenarios, and that it can be amplified through targeted intervention. By steering the internal residual stream during generation with a random sparse vector, we create a detectable fingerprint that enables attribution of a given text to a specific LLM. This signal is recoverable from the activations of an LLM used as a detector, achieving over 98% accuracy across multiple detection settings while preserving the quality of generated text. As AI-generated content proliferates, this approach offers a practical alternative to traditional detectors by leveraging the model's natural representation structure for attribution rather than embedding a signal externally. Our contributions include: (i) establishing reliable self-recognition capabilities in LLMs, (ii) a simple steering mechanism enabling multi-LLM identification with no quality degradation, (iii) demonstrating that activation spaces contain exploitable structure for encoding signals without semantic interference.
CLSep 7, 2022
Power of Explanations: Towards automatic debiasing in hate speech detectionYi Cai, Arthur Zimek, Gerhard Wunder et al.
Hate speech detection is a common downstream application of natural language processing (NLP) in the real world. In spite of the increasing accuracy, current data-driven approaches could easily learn biases from the imbalanced data distributions originating from humans. The deployment of biased models could further enhance the existing social biases. But unlike handling tabular data, defining and mitigating biases in text classifiers, which deal with unstructured data, are more challenging. A popular solution for improving machine learning fairness in NLP is to conduct the debiasing process with a list of potentially discriminated words given by human annotators. In addition to suffering from the risks of overlooking the biased terms, exhaustively identifying bias with human annotators are unsustainable since discrimination is variable among different datasets and may evolve over time. To this end, we propose an automatic misuse detector (MiD) relying on an explanation method for detecting potential bias. And built upon that, an end-to-end debiasing framework with the proposed staged correction is designed for text classifiers without any external resources required.
CLFeb 11, 2023
Explaining text classifiers through progressive neighborhood approximation with realistic samplesYi Cai, Arthur Zimek, Eirini Ntoutsi et al.
The importance of neighborhood construction in local explanation methods has been already highlighted in the literature. And several attempts have been made to improve neighborhood quality for high-dimensional data, for example, texts, by adopting generative models. Although the generators produce more realistic samples, the intuitive sampling approaches in the existing solutions leave the latent space underexplored. To overcome this problem, our work, focusing on local model-agnostic explanations for text classifiers, proposes a progressive approximation approach that refines the neighborhood of a to-be-explained decision with a careful two-stage interpolation using counterfactuals as landmarks. We explicitly specify the two properties that should be satisfied by generative models, the reconstruction ability and the locality-preserving property, to guide the selection of generators for local explanation methods. Moreover, noticing the opacity of generative models during the study, we propose another method that implements progressive neighborhood approximation with probability-based editions as an alternative to the generator-based solution. The explanation results from both methods consist of word-level and instance-level explanations benefiting from the realistic neighborhood. Through exhaustive experiments, we qualitatively and quantitatively demonstrate the effectiveness of the two proposed methods.
LGApr 22, 2023
Differentially Private Synthetic Data Generation via Lipschitz-Regularised Variational AutoencodersBenedikt Groß, Gerhard Wunder
Synthetic data has been hailed as the silver bullet for privacy preserving data analysis. If a record is not real, then how could it violate a person's privacy? In addition, deep-learning based generative models are employed successfully to approximate complex high-dimensional distributions from data and draw realistic samples from this learned distribution. It is often overlooked though that generative models are prone to memorising many details of individual training records and often generate synthetic data that too closely resembles the underlying sensitive training data, hence violating strong privacy regulations as, e.g., encountered in health care. Differential privacy is the well-known state-of-the-art framework for guaranteeing protection of sensitive individuals' data, allowing aggregate statistics and even machine learning models to be released publicly without compromising privacy. The training mechanisms however often add too much noise during the training process, and thus severely compromise the utility of these private models. Even worse, the tight privacy budgets do not allow for many training epochs so that model quality cannot be properly controlled in practice. In this paper we explore an alternative approach for privately generating data that makes direct use of the inherent stochasticity in generative models, e.g., variational autoencoders. The main idea is to appropriately constrain the continuity modulus of the deep models instead of adding another noise mechanism on top. For this approach, we derive mathematically rigorous privacy guarantees and illustrate its effectiveness with practical experiments.
LGAug 18, 2023
On Gradient-like Explanation under a Black-box Setting: When Black-box Explanations Become as Good as White-boxYi Cai, Gerhard Wunder
Attribution methods shed light on the explainability of data-driven approaches such as deep learning models by uncovering the most influential features in a to-be-explained decision. While determining feature attributions via gradients delivers promising results, the internal access required for acquiring gradients can be impractical under safety concerns, thus limiting the applicability of gradient-based approaches. In response to such limited flexibility, this paper presents \methodAbr~(gradient-estimation-based explanation), an approach that produces gradient-like explanations through only query-level access. The proposed approach holds a set of fundamental properties for attribution methods, which are mathematically rigorously proved, ensuring the quality of its explanations. In addition to the theoretical analysis, with a focus on image data, the experimental results empirically demonstrate the superiority of the proposed method over state-of-the-art black-box methods and its competitive performance compared to methods with full access.
SYAug 14, 2012
Wireless Network Design Under Service ConstraintsMartin Kasparick, Gerhard Wunder
In this paper we consider the design of wireless queueing network control policies with special focus on application-dependent service constraints. In particular we consider streaming traffic induced requirements such as avoiding buffer underflows, which significantly complicate the control problem compared to guaranteeing throughput optimality only. Since state-of-the-art approaches for enforcing minimum buffer constraints in broadcast networks are not suitable for application in general networks we argue for a cost function based approach, which combines throughput optimality with flexibility regarding service constraints. New theoretical stability results are presented and various candidate cost functions are investigated concerning their suitability for use in wireless networks with streaming media traffic. Furthermore we show how the cost function based approach can be used to aid wireless network design with respect to important system parameters. The performance is demonstrated using numerical simulations.
LGNov 3, 2025Code
Machine and Deep Learning for Indoor UWB Jammer LocalizationHamed Fard, Mahsa Kholghi, Benedikt Groß et al.
Ultra-wideband (UWB) localization delivers centimeter-scale accuracy but is vulnerable to jamming attacks, creating security risks for asset tracking and intrusion detection in smart buildings. Although machine learning (ML) and deep learning (DL) methods have improved tag localization, localizing malicious jammers within a single room and across changing indoor layouts remains largely unexplored. Two novel UWB datasets, collected under original and modified room configurations, are introduced to establish comprehensive ML/DL baselines. Performance is rigorously evaluated using a variety of classification and regression metrics. On the source dataset with the collected UWB features, Random Forest achieves the highest F1-macro score of 0.95 and XGBoost achieves the lowest mean Euclidean error of 20.16 cm. However, deploying these source-trained models in the modified room layout led to severe performance degradation, with XGBoost's mean Euclidean error increasing tenfold to 207.99 cm, demonstrating significant domain shift. To mitigate this degradation, a domain-adversarial ConvNeXt autoencoder (A-CNT) is proposed that leverages a gradient-reversal layer to align CIR-derived features across domains. The A-CNT framework restores localization performance by reducing the mean Euclidean error to 34.67 cm. This represents a 77 percent improvement over non-adversarial transfer learning and an 83 percent improvement over the best baseline, restoring the fraction of samples within 30 cm to 0.56. Overall, the results demonstrate that adversarial feature alignment enables robust and transferable indoor jammer localization despite environmental changes. Code and dataset available at https://github.com/afbf4c8996f/Jammer-Loc
SYJan 9, 2013
Stability and Cost Optimization in Controlled Random Walks Using Scheduling FieldsGerhard Wunder, Chan Zhou, Martin Kasparick
The control of large queueing networks is a notoriously difficult problem. Recently, an interesting new policy design framework for the control problem called h-MaxWeight has been proposed: h-MaxWeight is a natural generalization of the famous MaxWeight policy where instead of the quadratic any other surrogate value function can be applied. Stability of the policy is then achieved through a perturbation technique. However, stability crucially depends on parameter choice which has to be adapted in simulations. In this paper we use a different technique where the required perturbations can be directly implemented in the weight domain, which we call a scheduling field then. Specifically, we derive the theoretical arsenal that guarantees universal stability while still operating close to the underlying cost criterion. Simulation examples suggest that the new approach to policy synthesis can even provide significantly higher gains irrespective of any further assumptions on the network model or parameter choice.
CRSep 27, 2025Code
An Investigation into the Performance of Non-Contrastive Self-Supervised Learning Methods for Network Intrusion DetectionHamed Fard, Tobias Schalau, Gerhard Wunder
Network intrusion detection, a well-explored cybersecurity field, has predominantly relied on supervised learning algorithms in the past two decades. However, their limitations in detecting only known anomalies prompt the exploration of alternative approaches. Motivated by the success of self-supervised learning in computer vision, there is a rising interest in adapting this paradigm for network intrusion detection. While prior research mainly delved into contrastive self-supervised methods, the efficacy of non-contrastive methods, in conjunction with encoder architectures serving as the representation learning backbone and augmentation strategies that determine what is learned, remains unclear for effective attack detection. This paper compares the performance of five non-contrastive self-supervised learning methods using three encoder architectures and six augmentation strategies. Ninety experiments are systematically conducted on two network intrusion detection datasets, UNSW-NB15 and 5G-NIDD. For each self-supervised model, the combination of encoder architecture and augmentation method yielding the highest average precision, recall, F1-score, and AUCROC is reported. Furthermore, by comparing the best-performing models to two unsupervised baselines, DeepSVDD, and an Autoencoder, we showcase the competitiveness of the non-contrastive methods for attack detection. Code at: https://github.com/renje4z335jh4/non_contrastive_SSL_NIDS
LGDec 15, 2025
ALIGN-FL: Architecture-independent Learning through Invariant Generative component sharing in Federated LearningMayank Gulati, Benedikt Groß, Gerhard Wunder
We present ALIGN-FL, a novel approach to distributed learning that addresses the challenge of learning from highly disjoint data distributions through selective sharing of generative components. Instead of exchanging full model parameters, our framework enables privacy-preserving learning by transferring only generative capabilities across clients, while the server performs global training using synthetic samples. Through complementary privacy mechanisms: DP-SGD with adaptive clipping and Lipschitz regularized VAE decoders and a stateful architecture supporting heterogeneous clients, we experimentally validate our approach on MNIST and Fashion-MNIST datasets with cross-domain outliers. Our analysis demonstrates that both privacy mechanisms effectively map sensitive outliers to typical data points while maintaining utility in extreme Non-IID scenarios typical of cross-silo collaborations. Index Terms: Client-invariant Learning, Federated Learning (FL), Privacy-preserving Generative Models, Non-Independent and Identically Distributed (Non-IID), Heterogeneous Architectures
LGNov 11, 2025
Rethinking Explanation Evaluation under the Retraining SchemeYi Cai, Thibaud Ardoin, Mayank Gulati et al.
Feature attribution has gained prominence as a tool for explaining model decisions, yet evaluating explanation quality remains challenging due to the absence of ground-truth explanations. To circumvent this, explanation-guided input manipulation has emerged as an indirect evaluation strategy, measuring explanation effectiveness through the impact of input modifications on model outcomes during inference. Despite the widespread use, a major concern with inference-based schemes is the distribution shift caused by such manipulations, which undermines the reliability of their assessments. The retraining-based scheme ROAR overcomes this issue by adapting the model to the altered data distribution. However, its evaluation results often contradict the theoretical foundations of widely accepted explainers. This work investigates this misalignment between empirical observations and theoretical expectations. In particular, we identify the sign issue as a key factor responsible for residual information that ultimately distorts retraining-based evaluation. Based on the analysis, we show that a straightforward reframing of the evaluation process can effectively resolve the identified issue. Building on the existing framework, we further propose novel variants that jointly structure a comprehensive perspective on explanation evaluation. These variants largely improve evaluation efficiency over the standard retraining protocol, thereby enhancing practical applicability for explainer selection and benchmarking. Following our proposed schemes, empirical results across various data scales provide deeper insights into the performance of carefully selected explainers, revealing open challenges and future directions in explainability research.
LGJan 8, 2025
Tracking UWB Devices Through Radio Frequency Fingerprinting Is PossibleThibaud Ardoin, Niklas Pauli, Benedikt Groß et al.
Ultra-wideband (UWB) is a state-of-the-art technology designed for applications requiring centimeter-level localization. Its widespread adoption by smartphone manufacturer naturally raises security and privacy concerns. Successfully implementing Radio Frequency Fingerprinting (RFF) to UWB could enable physical layer security, but might also allow undesired tracking of the devices. The scope of this paper is to explore the feasibility of applying RFF to UWB and investigates how well this technique generalizes across different environments. We collected a realistic dataset using off-the-shelf UWB devices with controlled variation in device positioning. Moreover, we developed an improved deep learning pipeline to extract the hardware signature from the signal data. In stable conditions, the extracted RFF achieves over 99% accuracy. While the accuracy decreases in more changing environments, we still obtain up to 76% accuracy in untrained locations.
CLNov 25, 2024
Transparent Neighborhood Approximation for Text Classifier ExplanationYi Cai, Arthur Zimek, Eirini Ntoutsi et al.
Recent literature highlights the critical role of neighborhood construction in deriving model-agnostic explanations, with a growing trend toward deploying generative models to improve synthetic instance quality, especially for explaining text classifiers. These approaches overcome the challenges in neighborhood construction posed by the unstructured nature of texts, thereby improving the quality of explanations. However, the deployed generators are usually implemented via neural networks and lack inherent explainability, sparking arguments over the transparency of the explanation process itself. To address this limitation while preserving neighborhood quality, this paper introduces a probability-based editing method as an alternative to black-box text generators. This approach generates neighboring texts by implementing manipulations based on in-text contexts. Substituting the generator-based construction process with recursive probability-based editing, the resultant explanation method, XPROB (explainer with probability-based editing), exhibits competitive performance according to the evaluation conducted on two real-world datasets. Additionally, XPROB's fully transparent and more controllable construction process leads to superior stability compared to the generator-based explainers.
ITNov 12, 2021
A Reverse Jensen Inequality Result with Application to Mutual Information EstimationGerhard Wunder, Benedikt Groß, Rick Fritschek et al.
The Jensen inequality is a widely used tool in a multitude of fields, such as for example information theory and machine learning. It can be also used to derive other standard inequalities such as the inequality of arithmetic and geometric means or the Hölder inequality. In a probabilistic setting, the Jensen inequality describes the relationship between a convex function and the expected value. In this work, we want to look at the probabilistic setting from the reverse direction of the inequality. We show that under minimal constraints and with a proper scaling, the Jensen inequality can be reversed. We believe that the resulting tool can be helpful for many applications and provide a variational estimation of mutual information, where the reverse inequality leads to a new estimator with superior training behavior compared to current estimators.
ITJun 1, 2021
Reinforce Security: A Model-Free Approach Towards Secure Wiretap CodingRick Fritschek, Rafael F. Schaefer, Gerhard Wunder
The use of deep learning-based techniques for approximating secure encoding functions has attracted considerable interest in wireless communications due to impressive results obtained for general coding and decoding tasks for wireless communication systems. Of particular importance is the development of model-free techniques that work without knowledge about the underlying channel. Such techniques utilize for example generative adversarial networks to estimate and model the conditional channel distribution, mutual information estimation as a reward function, or reinforcement learning. In this paper, the approach of reinforcement learning is studied and, in particular, the policy gradient method for a model-free approach of neural network-based secure encoding is investigated. Previously developed techniques for enforcing a certain co-set structure on the encoding process can be combined with recent reinforcement learning approaches. This new approach is evaluated by extensive simulations, and it is demonstrated that the resulting decoding performance of an eavesdropper is capped at a certain error level.
CRMar 11, 2021
ComPass: Proximity Aware Common Passphrase Agreement Protocol for Wi-Fi devices Using Physical Layer SecurityKhan Reaz, Gerhard Wunder
Secure and scalable device provisioning is a notorious challenge in Wi-Fi. WPA2/WPA3 solutions take user interaction and a strong passphrase for granted. However, the often weak passphrases are subject to guessing attacks. Notably, there has been a significant rise of cyberattacks on Wi-Fi home or small office networks during the COVID-19 pandemic. This paper addresses the device provisioning problem in Wi-Fi (personal mode) and proposes ComPass protocol to supplement WPA2/WPA3. ComPass replaces the pre-installed or user-selected passphrases with automatically generated ones. For this, ComPass employs Physical Layer Security and extracts credentials from common random physical layer parameters between devices. Two major features make ComPass unique and superior compared to previous proposals: First, it employs phase information (rather than amplitude or signal strength) to generate the passphrase so that it is robust, scaleable, and impossible to guess. Our analysis showed that ComPass generated passphrases have 3 times more entropy than human generated passphrases (113-bits vs. 34-bits). Second, ComPass selects parameters such that two devices bind only within a certain proximity (less than 3m), hence providing practically useful in-build PLS-based authentiation. ComPass is available as a kernel module or as full firmware.
ITMar 7, 2019
Deep Learning for Channel Coding via Neural Mutual Information EstimationRick Fritschek, Rafael F. Schaefer, Gerhard Wunder
End-to-end deep learning for communication systems, i.e., systems whose encoder and decoder are learned, has attracted significant interest recently, due to its performance which comes close to well-developed classical encoder-decoder designs. However, one of the drawbacks of current learning approaches is that a differentiable channel model is needed for the training of the underlying neural networks. In real-world scenarios, such a channel model is hardly available and often the channel density is not even known at all. Some works, therefore, focus on a generative approach, i.e., generating the channel from samples, or rely on reinforcement learning to circumvent this problem. We present a novel approach which utilizes a recently proposed neural estimator of mutual information. We use this estimator to optimize the encoder for a maximized mutual information, only relying on channel samples. Moreover, we show that our approach achieves the same performance as state-of-the-art end-to-end learning with perfect channel model knowledge.
SYSep 14, 2018
Distributed MPC with Prediction of Time-Varying Communication DelayJannik Hahn, Richard Schoeffauer, Gerhard Wunder et al.
The novel idea presented in this paper is to interweave distributed model predictive control with a reliable scheduling of the information that is interchanged between local controllers of the plant subsystems. To this end, a dynamic model of the communication network and a predictive scheduling algorithm are proposed, the latter providing predictions of the delay between sending and receiving information. These predictions can be used by the local subsystem controllers to improve their control performance, as exemplary shown for a platooning example.
ITMay 9, 2017
Compressive Estimation of a Stochastic Process with Unknown Autocorrelation FunctionMahdi Barzegar Khalilsarai, Saeid Haghighatshoar, Giuseppe Caire et al.
In this paper, we study the prediction of a circularly symmetric zero-mean stationary Gaussian process from a window of observations consisting of finitely many samples. This is a prevalent problem in a wide range of applications in communication theory and signal processing. Due to stationarity, when the autocorrelation function or equivalently the power spectral density (PSD) of the process is available, the Minimum Mean Squared Error (MMSE) predictor is readily obtained. In particular, it is given by a linear operator that depends on autocorrelation of the process as well as the noise power in the observed samples. The prediction becomes, however, quite challenging when the PSD of the process is unknown. In this paper, we propose a blind predictor that does not require the a priori knowledge of the PSD of the process and compare its performance with that of an MMSE predictor that has a full knowledge of the PSD. To design such a blind predictor, we use the random spectral representation of a stationary Gaussian process. We apply the well-known atomic-norm minimization technique to the observed samples to obtain a discrete quantization of the underlying random spectrum, which we use to predict the process. Our simulation results show that this estimator has a good performance comparable with that of the MMSE estimator.