CL · May 26

FinHarness: An Inline Lifecycle Safety Harness for Finance LLM Agents

arXiv:2605.27333 · 82.3
Predicted impact: top 62% in CL · last 90 days
Originality: Incremental advance
AI Analysis

Provides a practical safety mechanism for finance LLM agents to prevent irreversible harmful actions during multi-step workflows.

FinHarness reduces attack success rate from 38.3% to 15.0% in finance LLM agents while preserving benign approval (41.1% to 39.3%) and using 4.7× fewer advanced-judge calls.

Finance LLM agents must simultaneously block prompt-induced unauthorized actions and approve legitimate multi-step business workflows. However, boundary filters often miss irreversible mid-trajectory tool calls, while post-hoc LLM judges audit only after termination, which is too late for intervention and incurs a computational cost that scales linearly with trace length. We present FinHarness, an inline safety harness that wraps a finance agent end-to-end with three components: a Query Monitor that fuses single-turn intent with cross-turn drift, a Tool Monitor that evaluates each prospective tool call, and a Cascade module that integrates per-step risk and adaptively routes verification between a lightweight and an advanced-tier LLM judge. Fired risk factors are re-injected into the agent input as ex-ante evidence, enabling the agent to refuse, re-plan, or approve on its own. On FinVault, routed FinHarness cuts the attack success rate (ASR) from 38.3% to 15.0% while largely preserving benign approval (41.1% to 39.3%), and uses 4.7× fewer advanced-judge calls than an always-advanced ablation.
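The routing idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how per-step risk is integrated or when verification escalates, so the decayed running sum, the `escalate_at` threshold, the `StepRisk` and `Cascade` names, and the stub judges are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepRisk:
    score: float          # assumed risk score in [0, 1] for one step
    factors: List[str]    # names of the risk factors that fired


# A judge returns True when it considers the prospective tool call unsafe.
Judge = Callable[[str, List[str]], bool]


@dataclass
class Cascade:
    light_judge: Judge
    advanced_judge: Judge
    escalate_at: float = 0.5   # hypothetical escalation threshold
    decay: float = 0.8         # hypothetical decay for accumulated risk
    cumulative: float = 0.0
    advanced_calls: int = 0

    def check(self, tool_call: str, query_risk: StepRisk,
              tool_risk: StepRisk) -> List[str]:
        """Return the risk factors to re-inject into the agent input."""
        # Integrate per-step risk across the trajectory (assumed rule).
        step = max(query_risk.score, tool_risk.score)
        self.cumulative = self.decay * self.cumulative + step
        factors = query_risk.factors + tool_risk.factors

        # Route to the advanced judge only when accumulated risk is high.
        if self.cumulative >= self.escalate_at:
            self.advanced_calls += 1
            unsafe = self.advanced_judge(tool_call, factors)
        else:
            unsafe = self.light_judge(tool_call, factors)

        # The harness only supplies evidence; the agent decides what to do.
        return factors if unsafe else []


# Stub judges standing in for the two LLM tiers.
def light(call: str, factors: List[str]) -> bool:
    return False


def advanced(call: str, factors: List[str]) -> bool:
    return "irreversible_transfer" in factors


cascade = Cascade(light, advanced)
evidence_1 = cascade.check("get_balance", StepRisk(0.1, []), StepRisk(0.1, []))
evidence_2 = cascade.check(
    "wire_transfer",
    StepRisk(0.3, ["intent_drift"]),
    StepRisk(0.9, ["irreversible_transfer"]),
)
```

In this sketch the low-risk balance lookup stays on the lightweight judge, while the high-risk transfer pushes accumulated risk over the threshold and is escalated. Escalating only on accumulated risk is what would keep advanced-judge calls below those of an always-advanced configuration.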

