Nikolay Matyunin

CR
h-index7
4papers
43citations
Novelty65%
AI Score45

4 Papers

67.5CRMar 10
Amnesia: Adversarial Semantic Layer Specific Activation Steering in Large Language Models

Ali Raza, Gurang Gupta, Nikolay Matyunin et al.

Warning: This article includes red-teaming experiments, which contain examples of compromised LLM responses that may be offensive or upsetting. Large Language Models (LLMs) have the potential to create harmful content, such as generating sophisticated phishing emails and assisting in writing code of harmful computer viruses. Thus, it is crucial to ensure their safe and responsible response generation. To reduce the risk of generating harmful or irresponsible content, researchers have developed techniques such as reinforcement learning with human feedback to align LLM's outputs with human values and preferences. However, it is still undetermined whether such measures are sufficient to prevent LLMs from generating interesting responses. In this study, we propose Amnesia, a lightweight activation-space adversarial attack that manipulates internal transformer states to bypass existing safety mechanisms in open-weight LLMs. Through experimental analysis on state-of-the-art, open-weight LLMs, we demonstrate that our attack effectively circumvents existing safeguards, enabling the generation of harmful content without the need for any fine-tuning or additional training. Our experiments on benchmark datasets show that the proposed attack can induce various antisocial behaviors in LLMs. These findings highlight the urgent need for more robust security measures in open-weight LLMs and underscore the importance of continued research to prevent their potential misuse.

AIJan 15
LADFA: A Framework of Using Large Language Models and Retrieval-Augmented Generation for Personal Data Flow Analysis in Privacy Policies

Haiyue Yuan, Nikolay Matyunin, Ali Raza et al.

Privacy policies help inform people about organisations' personal data processing practices, covering different aspects such as data collection, data storage, and sharing of personal data with third parties. Privacy policies are often difficult for people to fully comprehend due to the lengthy and complex legal language used and inconsistent practices across different sectors and organisations. To help conduct automated and large-scale analyses of privacy policies, many researchers have studied applications of machine learning and natural language processing techniques, including large language models (LLMs). While a limited number of prior studies utilised LLMs for extracting personal data flows from privacy policies, our approach builds on this line of work by combining LLMs with retrieval-augmented generation (RAG) and a customised knowledge base derived from existing studies. This paper presents the development of LADFA, an end-to-end computational framework, which can process unstructured text in a given privacy policy, extract personal data flows and construct a personal data flow graph, and conduct analysis of the data flow graph to facilitate insight discovery. The framework consists of a pre-processor, an LLM-based processor, and a data flow post-processor. We demonstrated and validated the effectiveness and accuracy of the proposed approach by conducting a case study that involved examining ten selected privacy policies from the automotive industry. Moreover, it is worth noting that LADFA is designed to be flexible and customisable, making it suitable for a range of text-based analysis tasks beyond privacy policy analysis.

CRJun 26, 2021
Evaluation of Cache Attacks on Arm Processors and Secure Caches

Shuwen Deng, Nikolay Matyunin, Wenjie Xiong et al.

Timing-based side and covert channels in processor caches continue to be a threat to modern computers. This work shows for the first time a systematic, large-scale analysis of Arm devices and the detailed results of attacks the processors are vulnerable to. Compared to x86, Arm uses different architectures, microarchitectural implementations, cache replacement policies, etc., which affects how attacks can be launched, and how security testing for the vulnerabilities should be done. To evaluate security, this paper presents security benchmarks specifically developed for testing Arm processors and their caches. The benchmarks are themselves evaluated with sensitivity tests, which examine how sensitive the benchmarks are to having a correct configuration in the testing phase. Further, to evaluate a large number of devices, this work leverages a novel approach of using a cloud-based Arm device testbed for architectural and security research on timing channels and runs the benchmarks on 34 different physical devices. In parallel, there has been much interest in secure caches to defend the various attacks. Consequently, this paper also investigates secure cache architectures using the proposed benchmarks. Especially, this paper implements and evaluates the secure PL and RF caches, showing the security of PL and RF caches, but also uncovers new weaknesses.

CRJun 26, 2019
MagneticSpy: Exploiting Magnetometer in Mobile Devices for Website and Application Fingerprinting

Nikolay Matyunin, Yujue Wang, Tolga Arul et al.

Recent studies have shown that aggregate CPU usage and power consumption traces on smartphones can leak information about applications running on the system or websites visited. In response, access to such data has been blocked for mobile applications starting from Android 8. In this work, we explore a new source of side-channel leakage for this class of attacks. Our method is based on the fact that electromagnetic activity caused by mobile processors leads to noticeable disturbances in magnetic sensor measurements on mobile devices, with the amplitude being proportional to the CPU workload. Therefore, recorded sensor data can be analyzed to reveal information about ongoing activities. The attack works on a number of devices: we evaluated 80 models of modern smartphones and tablets and observed the reaction of the magnetometer to the CPU activity on 56 of them. On selected devices we were able to successfully identify which application has been opened (with up to 90% accuracy) or which web page has been loaded (up to 91% accuracy). The presented side channel poses a significant risk to end users' privacy, as the sensor data can be recorded from native apps or even from web pages without user permissions. Finally, we discuss possible countermeasures to prevent the presented information leakage.