Enrico Ahlers

h-index: 11
2 papers

LG · Feb 11
Kill it with FIRE: On Leveraging Latent Space Directions for Runtime Backdoor Mitigation in Deep Neural Networks

Enrico Ahlers, Daniel Passon, Yannic Noller et al.

Machine learning models are increasingly present in our everyday lives; as a result, they become targets of adversarial attackers seeking to manipulate the systems we interact with. A well-known vulnerability is a backdoor introduced into a neural network by poisoned training data or a malicious training process. Backdoors can be used to induce unwanted behavior by including a certain trigger in the input. Existing mitigations filter training data, modify the model, or perform expensive input modifications on samples. If a vulnerable model has already been deployed, however, those strategies are either ineffective or inefficient. To address this gap, we propose our inference-time backdoor mitigation approach called FIRE (Feature-space Inference-time REpair). We hypothesize that a trigger induces structured and repeatable changes in the model's internal representation. We view the trigger as directions in the latent spaces between layers that can be applied in reverse to correct the inference mechanism. Therefore, we turn the backdoored model against itself by manipulating its latent representations and moving a poisoned sample's features along the backdoor directions to neutralize the trigger. Our evaluation shows that FIRE has low computational overhead and outperforms current runtime mitigations on image benchmarks across various attacks, datasets, and network architectures.
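
The idea of moving a poisoned sample's features back along a backdoor direction can be illustrated with a minimal sketch. The abstract does not state how FIRE computes its directions, so the mean-difference estimator, the function names `estimate_direction` and `repair`, the `strength` parameter, and the synthetic 16-dimensional features below are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def estimate_direction(clean_feats, triggered_feats):
    # Assumed estimator: mean latent shift caused by the trigger.
    # The paper may compute its directions differently.
    return triggered_feats.mean(axis=0) - clean_feats.mean(axis=0)


def repair(feats, direction, strength=1.0):
    # Move features against the estimated backdoor direction.
    return feats - strength * direction


# Synthetic stand-in for one layer's latent features (hypothetical data).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 16))

# Pretend the trigger adds a fixed offset along one latent coordinate.
trigger_shift = np.zeros(16)
trigger_shift[3] = 5.0
triggered = clean + trigger_shift

direction = estimate_direction(clean, triggered)
repaired = repair(triggered, direction)
```

In this toy setting the trigger is a pure translation, so subtracting the estimated direction recovers the clean features. A real network would need directions per layer and some way to decide which inputs to repair, which this sketch does not cover.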

INS-DET · Feb 21, 2025
Inverse Surrogate Model of a Soft X-Ray Spectrometer using Domain Adaptation

Enrico Ahlers, Peter Feuer-Forson, Gregor Hartmann et al.

In this study, we present a method to create a robust inverse surrogate model for a soft X-ray spectrometer. During a beamtime at an electron storage ring, such as BESSY II, instrumentation and beamlines are required to be correctly aligned and calibrated for optimal experimental conditions. In order to automate these processes, machine learning methods can be developed and implemented, but in many cases these methods require the use of an inverse model which maps the output of the experiment, such as a detector image, to the parameters of the device. Due to limited experimental data, such models are often trained with simulated data, which creates the challenge of compensating for the inherent differences between simulation and experiment. In order to close this gap, we demonstrate the application of data augmentation and adversarial domain adaptation techniques, with which we can predict absolute coordinates for the automated alignment of our spectrometer. Bridging the simulation-experiment gap with minimal real-world data opens new avenues for automated experimentation using machine learning in scientific instrumentation.
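
As a rough illustration of the data augmentation step mentioned above, the sketch below perturbs a simulated detector image with a random intensity gain and additive noise. The abstract does not list the augmentations actually used. The function `augment`, its parameters, and the 32x32 toy image are hypothetical, and only intensity-level perturbations are shown because geometric shifts would change the absolute coordinates an inverse model is meant to predict. The adversarial domain adaptation part is not sketched here.

```python
import numpy as np


def augment(image, rng, gain_range=(0.8, 1.2), noise_std=0.05):
    # Random global intensity gain (assumed to mimic exposure variation).
    gain = rng.uniform(*gain_range)
    # Additive Gaussian noise (assumed stand-in for detector noise).
    noise = rng.normal(scale=noise_std, size=image.shape)
    # Clip so that pixel intensities stay non-negative.
    return np.clip(gain * image + noise, 0.0, None)


# Hypothetical simulated detector image: a bright spot on a dark background.
rng = np.random.default_rng(0)
simulated = np.zeros((32, 32))
simulated[14:18, 14:18] = 1.0

augmented = augment(simulated, rng)
```

Each call draws a new gain and noise pattern, so a single simulated image can yield many training variants while its device-parameter label stays valid.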