Mar 26

Natural-Language Agent Harnesses

arXiv:2603.25723 96.017 citations h-index: 2
AI Analysis

The paper addresses harness design for AI agent developers, but it appears incremental because it builds on existing agent frameworks.

The paper tackles the problem that agent harness engineering is hard to transfer and compare. It introduces Natural-Language Agent Harnesses (NLAHs) and the Intelligent Harness Runtime (IHR), and evaluates them in controlled experiments across coding and computer-use benchmarks.

Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object. We ask whether the high-level control logic of an agent harness can instead be externalized as a portable executable artifact. We introduce Natural-Language Agent Harnesses (NLAHs), which express harness behavior in editable natural language, and Intelligent Harness Runtime (IHR), a shared runtime that executes these harnesses through explicit contracts, durable artifacts, and lightweight adapters. Across coding and computer-use benchmarks, we conduct controlled evaluations of operational viability, module ablation, and code-to-text harness migration.
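To make the abstract's idea more concrete, here is a minimal sketch of what "harness logic as editable text, executed by a shared runtime" could look like. It is not the paper's IHR implementation: the names `Harness`, `Adapter`, `EchoAdapter`, and `execute` are hypothetical, the "contract" is reduced to a list of required output keys, and the adapter is a toy stand-in for a real model or benchmark backend.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Tuple


@dataclass(frozen=True)
class Harness:
    """Hypothetical container: the control logic lives in plain, editable text."""
    name: str
    instructions: str                   # natural-language control logic
    required_outputs: Tuple[str, ...]   # simplified stand-in for an explicit contract


class Adapter(Protocol):
    """Hypothetical adapter interface: turns a prompt into named outputs."""
    def run(self, prompt: str) -> Dict[str, str]: ...


class EchoAdapter:
    """Toy adapter used in place of a real model or benchmark environment."""
    def run(self, prompt: str) -> Dict[str, str]:
        return {"patch": f"# generated from {len(prompt)} prompt chars", "log": "ok"}


def execute(harness: Harness, task: str, adapter: Adapter) -> Dict[str, str]:
    """Run one task under a harness and check the outputs against its contract."""
    prompt = f"{harness.instructions}\n\nTask: {task}"
    result = adapter.run(prompt)
    # Contract check: every output the harness declares must be present.
    missing = [key for key in harness.required_outputs if key not in result]
    if missing:
        raise ValueError(f"contract violated, missing outputs: {missing}")
    return result


if __name__ == "__main__":
    demo = Harness("demo", "Plan, then edit, then verify.", ("patch", "log"))
    print(execute(demo, "fix the failing test", EchoAdapter()))
```

In this sketch, swapping the `instructions` string changes the harness without touching the runtime, and swapping the adapter changes the backend without touching the harness. That separation is the assumption being illustrated, not a claim about how the paper implements it.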

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
