Dmitry Namiot

CR
h-index: 5
6 papers
38 citations
Novelty: 33%
AI Score: 39

6 Papers

CR | May 13 | Code
Sleeper Channels and Provenance Gates: Persistent Prompt Injection in Always-on Autonomous AI Agents

Narek Maloyan, Dmitry Namiot

Always-on AI agents (OpenClaw, Hermes Agent) run as a single persistent process under the owner's identity, folding messaging, memory, self-authored skills, scheduling, and shell into one authority boundary. This configuration opens what we call sleeper channels: an untrusted input to one surface persists as a memory, skill, scheduled job, or filesystem patch, then fires later through a different surface with no attacker present. Two independent axes define the class: persistence substrate and firing separation. We walk a confused-deputy cron attack end-to-end through OpenClaw at a pinned commit. The defense is tiered (D1, D2, D3), and D2 carries a soundness theorem against seven named deployment invariants. D2 keys on a canonical action-instance digest with one-shot owner attestations, defeating paraphrase laundering, multi-input grant reuse, and replay. A companion artifact ships the gate, a static audit over the vendored source, and a runtime adapter realising five of the ten mediation hooks (H1, H2, H3, H6, H9) around the cron path (42 tests, Node ≥ 20, at https://github.com/maloyan/sleeper-channels). Empirical evaluation is preregistered as follow-on.
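The idea of keying approvals on a canonical action digest with one-shot attestations can be illustrated with a minimal sketch. This is not the paper's artifact (which targets Node); the names `action_digest` and `OneShotGate`, the JSON canonicalization, and the action fields are all assumptions made for illustration.

```python
import hashlib
import json


def action_digest(action: dict) -> str:
    """Hash a canonical JSON encoding so key order and whitespace do not matter."""
    canonical = json.dumps(action, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class OneShotGate:
    """Hypothetical gate: each owner attestation approves one digest, once."""

    def __init__(self) -> None:
        self._pending = set()

    def attest(self, action: dict) -> str:
        """Record the owner's approval of exactly this action instance."""
        digest = action_digest(action)
        self._pending.add(digest)
        return digest

    def authorize(self, action: dict) -> bool:
        """Allow the action only if an unused attestation matches its digest."""
        digest = action_digest(action)
        if digest in self._pending:
            self._pending.remove(digest)  # consume the attestation
            return True
        return False
```

In this sketch, an action that differs in any field produces a different digest and is refused, and a second attempt at the same action fails because the attestation has already been consumed.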

CL | May 19, 2025 | Code
Investigating the Vulnerability of LLM-as-a-Judge Architectures to Prompt-Injection Attacks

Narek Maloyan, Bislan Ashinov, Dmitry Namiot

Large Language Models (LLMs) are increasingly employed as evaluators (LLM-as-a-Judge) for assessing the quality of machine-generated text. This paradigm offers scalability and cost-effectiveness compared to human annotation. However, the reliability and security of such systems, particularly their robustness against adversarial manipulations, remain critical concerns. This paper investigates the vulnerability of LLM-as-a-Judge architectures to prompt-injection attacks, where malicious inputs are designed to compromise the judge's decision-making process. We formalize two primary attack strategies: Comparative Undermining Attack (CUA), which directly targets the final decision output, and Justification Manipulation Attack (JMA), which aims to alter the model's generated reasoning. Using the Greedy Coordinate Gradient (GCG) optimization method, we craft adversarial suffixes appended to one of the responses being compared. Experiments conducted on the MT-Bench Human Judgments dataset with open-source instruction-tuned LLMs (Qwen2.5-3B-Instruct and Falcon3-3B-Instruct) demonstrate significant susceptibility. The CUA achieves an Attack Success Rate (ASR) exceeding 30%, while JMA also shows notable effectiveness. These findings highlight substantial vulnerabilities in current LLM-as-a-Judge systems, underscoring the need for robust defense mechanisms and further research into adversarial evaluation and trustworthiness in LLM-based assessment frameworks.
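As a rough illustration of how an attack success rate for a decision-targeting attack might be computed, the sketch below counts verdict flips toward the attacked response. The abstract does not give the exact ASR definition, so the eligibility rule (only comparisons the attacked response originally lost) and the function name `attack_success_rate` are assumptions.

```python
def attack_success_rate(clean_verdicts, attacked_verdicts, target):
    """Fraction of originally lost comparisons that flip to `target` under attack.

    clean_verdicts / attacked_verdicts: the judge's picks before and after the
    adversarial suffix is appended; target: label of the attacked response.
    """
    # Only comparisons the target did not already win can be "flipped".
    eligible = [
        (clean, attacked)
        for clean, attacked in zip(clean_verdicts, attacked_verdicts)
        if clean != target
    ]
    if not eligible:
        return 0.0
    flipped = sum(1 for _, attacked in eligible if attacked == target)
    return flipped / len(eligible)
```

For example, with clean verdicts `["A", "A", "B", "A"]`, attacked verdicts `["B", "A", "B", "B"]`, and target `"B"`, three comparisons are eligible and two flip.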

CR | Apr 25, 2025
Adversarial Attacks on LLM-as-a-Judge Systems: Insights from Prompt Injections

Narek Maloyan, Dmitry Namiot

LLM-as-a-judge systems, used to assess text quality, code correctness, and argument strength, are vulnerable to prompt-injection attacks. We introduce a framework that separates content-author attacks from system-prompt attacks and evaluate five models (Gemma 3 27B, Gemma 3 4B, Llama 3.2 3B, GPT-4, and Claude 3 Opus) on four tasks with various defenses, using fifty prompts per condition. Attacks achieved up to 73.8% success, smaller models proved more vulnerable, and transferability ranged from 50.5% to 62.6%. Our results contrast with those of Universal Prompt Injection and AdvPrompter. We recommend multi-model committees and comparative scoring, and we release all code and datasets.
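One way to read the multi-model committee recommendation is a strict-majority vote over independent judges, so that an injection must sway several models rather than one. The sketch below is an assumption about how such a committee could be aggregated, not the authors' released code; `committee_verdict` is a hypothetical name.

```python
from collections import Counter


def committee_verdict(verdicts):
    """Return the label with the most votes, or None on an empty list or a tie."""
    if not verdicts:
        return None
    ranked = Counter(verdicts).most_common()
    top_label, top_votes = ranked[0]
    # A tie for first place means the committee has no clear decision.
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None
    return top_label
```

Returning `None` on a tie leaves room for escalation, for example to a human reviewer, instead of silently picking a side.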

NI | Oct 16, 2015
The Physical Web in Smart Cities

Dmitry Namiot, Manfred Sneps-Sneppe

In this paper, we discuss Physical Web projects based on network proximity for Smart Cities. In general, the Physical Web is an approach for connecting any physical object to the web. With this approach, we can navigate and control physical objects in the world surrounding mobile devices. Alternatively, we can execute services on mobile devices, depending on the surrounding physical objects. Technically, there are different ways to enumerate physical objects. In this paper, we target models based on wireless proximity.

CY | Jun 7, 2015
On Network Proximity in Web Applications

Dmitry Namiot, Manfred Sneps-Sneppe

In this paper, we discuss one approach to the development and deployment of web sites (web pages) that describe objects (events) with a precisely delineated geographic scope. This article describes the use of context-aware programming models for web development. We propose mechanisms for creating mobile web applications whose content is linked to a predefined geographic area. The accuracy of such a binding allows us to distinguish individual areas within the same indoor space. Target areas for such development are applications for Smart Cities and retail.
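A minimal sketch of binding content to an area through network proximity follows. It assumes that an area is identified by the set of wireless node identifiers (for example Wi-Fi or Bluetooth names) a device can currently see; the abstract does not specify this mechanism, and `select_content`, the binding format, and the node names are hypothetical.

```python
def select_content(visible_nodes, bindings):
    """Return the content items whose required nodes are all currently visible.

    visible_nodes: iterable of node identifiers seen by the device.
    bindings: list of (required_nodes, content) pairs, required_nodes a set.
    """
    visible = set(visible_nodes)
    return [content for required, content in bindings if required <= visible]


# Hypothetical bindings: two areas inside the same building.
bindings = [
    ({"mall-ap-1"}, "Ground floor offers"),
    ({"mall-ap-1", "mall-ap-7"}, "Food court menu"),
]
```

Requiring several nodes at once is one simple way such a binding could separate areas within the same indoor space.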

NI | Mar 23, 2015
Metadata in SDN API

Dmitry Namiot, Manfred Sneps-Sneppe

This paper discusses the system aspects of developing application programming interfaces in Software-Defined Networking (SDN). Almost all existing SDN interfaces use so-called Representational State Transfer (REST) services as a basic model. This model is simple and straightforward for developers, but often does not support the information (metadata) necessary for programming automation. In this article, we cover the issues of representing metadata in the SDN API.