AI · CL · Dec 18, 2025

QuadSentinel: Sequent Safety for Machine-Checkable Control in Multi-agent Systems

arXiv:2512.16279v1 · 2 citations · h-index: 7 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses safety control for deployers of large language model-based agents on complex tasks, though it appears incremental because it builds on existing guardrail methods.

The paper tackles the problem of safety risks in multi-agent systems by proposing QuadSentinel, a four-agent guard that compiles safety policies into machine-checkable rules and enforces them online, improving guardrail accuracy and rule recall while reducing false positives on benchmarks like ST-WebAgentBench and AgentHarm.

Safety risks arise as large language model-based agents solve complex tasks with tools, multi-step plans, and inter-agent messages. However, deployer-written policies in natural language are ambiguous and context dependent, so they map poorly to machine-checkable rules, and runtime enforcement is unreliable. Expressing safety policies as sequents, we propose QuadSentinel, a four-agent guard (state tracker, policy verifier, threat watcher, and referee) that compiles these policies into machine-checkable rules built from predicates over observable state and enforces them online. Referee logic plus an efficient top-k predicate updater keeps costs low by prioritizing checks and resolving conflicts hierarchically. Measured on ST-WebAgentBench (ICML CUA '25) and AgentHarm (ICLR '25), QuadSentinel improves guardrail accuracy and rule recall while reducing false positives. Against single-agent baselines such as ShieldAgent (ICML '25), it yields better overall safety control. Near-term deployments can adopt this pattern without modifying core agents by keeping policies separate and machine-checkable. Our code will be made publicly available at https://github.com/yyiliu/QuadSentinel.
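To make "machine-checkable rules built from predicates over observable state" concrete, here is a minimal sketch of one possible reading, not the paper's implementation. It assumes a simplified sequent form in which a rule is violated when all of its antecedent predicates hold but its consequent does not, and it assumes conflicts are resolved by a numeric priority. All names (`Rule`, `evaluate_predicates`, `referee`, the example predicates) are hypothetical, and the top-k predicate updater is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Optional, Tuple

State = Dict[str, object]                 # observable state (assumed shape)
Predicate = Callable[[State], bool]


@dataclass(frozen=True)
class Rule:
    """Simplified sequent: antecedents |- consequent (an assumption)."""
    name: str
    antecedents: FrozenSet[str]   # predicate names that must all hold
    consequent: str               # predicate name that must then hold
    priority: int                 # lower number wins when rules conflict


def evaluate_predicates(predicates: Dict[str, Predicate],
                        state: State) -> Dict[str, bool]:
    """Evaluate every named predicate against the observable state."""
    return {name: bool(fn(state)) for name, fn in predicates.items()}


def violated_rules(rules: List[Rule], truth: Dict[str, bool]) -> List[Rule]:
    """Return violated rules, highest priority first."""
    hits = [r for r in rules
            if all(truth[p] for p in r.antecedents) and not truth[r.consequent]]
    return sorted(hits, key=lambda r: r.priority)


def referee(rules: List[Rule],
            truth: Dict[str, bool]) -> Tuple[str, Optional[str]]:
    """Block on the highest-priority violation, otherwise allow."""
    hits = violated_rules(rules, truth)
    return ("block", hits[0].name) if hits else ("allow", None)


# Hypothetical policy: "a delete action requires user confirmation".
predicates: Dict[str, Predicate] = {
    "is_delete_action": lambda s: s.get("action") == "delete",
    "user_confirmed": lambda s: s.get("confirmed") is True,
}
rules = [
    Rule("confirm_before_delete",
         frozenset({"is_delete_action"}), "user_confirmed", priority=1),
]

unconfirmed = {"action": "delete", "confirmed": False}
confirmed = {"action": "delete", "confirmed": True}
print(referee(rules, evaluate_predicates(predicates, unconfirmed)))
print(referee(rules, evaluate_predicates(predicates, confirmed)))
```

Keeping the rules as data separate from the agent under guard mirrors the abstract's point that policies can stay separate and machine-checkable without modifying core agents.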

