LG · May 10

Kintsugi: Learning Policies by Repairing Executable Knowledge Bases

arXiv:2605.09487 · 84.9
Predicted impact top 11% in LG · last 90 days · Originality: Highly original
AI Analysis

For embodied AI researchers, Kintsugi offers a verifiable, inspectable alternative to black-box neural policies, enabling localized debugging and reuse of task knowledge.

Kintsugi introduces a white-box policy-learning framework that represents task-level knowledge as an executable knowledge base (KB) and improves it via localized typed edits from rollout evidence, achieving strong performance on long-horizon text-agent and object-centric manipulation tasks while preserving inspectability and editability.

Modern embodied agents achieve impressive performance, but their task knowledge is often stored in neural weights, latent state, or prompt-bound memory, making individual policy knowledge difficult to inspect, validate, recombine, and reuse. We introduce Kintsugi, a white-box policy-learning framework that treats embodied policy improvement as verifier-gated construction of a typed executable Knowledge Base (KB). Kintsugi represents task-level policy knowledge as composable typed entries (predicates, operators, policy schemas, monitors, recovery rules, experience records, and goals) and improves this artifact through localized typed edits induced from rollout evidence, rather than relying on test-time language-model reasoning. Between rollouts, a tool-constrained agentic editing loop diagnoses trajectory failures, localizes them to editable KB layers, and proposes candidate edits. A deterministic verification gate admits an edit only when the candidate type-checks, the resulting KB executes, and focused validation success or trajectory-health metrics improve without violating protected-regression checks. At inference, the accepted KB is executed by a deterministic symbolic executor with zero LLM calls. Across long-horizon text-agent benchmarks and representative object-centric manipulation settings, Kintsugi achieves strong endpoint performance while preserving inspectability, local editability, and verifier-gated deployment. These results suggest that embodied policy improvement can be organized around executable task knowledge.
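The verification gate described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: every name here (`Entry`, `Edit`, `Metrics`, `gate`, `evaluate`, `protected_checks`) is a hypothetical stand-in, the KB is reduced to a dictionary of typed entries, and "the KB executes" is approximated by the evaluation callback not raising an exception.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical names throughout; the entry kinds follow the abstract's list.
ENTRY_KINDS = {"predicate", "operator", "policy_schema", "monitor",
               "recovery_rule", "experience_record", "goal"}


@dataclass(frozen=True)
class Entry:
    name: str
    kind: str  # expected to be one of ENTRY_KINDS
    body: str  # opaque payload in this sketch


@dataclass(frozen=True)
class Edit:
    entry: Entry  # a localized edit adds or overwrites exactly one entry


@dataclass(frozen=True)
class Metrics:
    validation_success: float
    trajectory_health: float


KB = Dict[str, Entry]


def apply_edit(kb: KB, edit: Edit) -> KB:
    """Return a new KB with the edit applied; the input KB is not mutated."""
    candidate = dict(kb)
    candidate[edit.entry.name] = edit.entry
    return candidate


def type_checks(edit: Edit) -> bool:
    """Assumed type check: the entry has a name and a known kind."""
    return bool(edit.entry.name) and edit.entry.kind in ENTRY_KINDS


def gate(kb: KB, edit: Edit,
         evaluate: Callable[[KB], Metrics],
         protected_checks: List[Callable[[KB], bool]]) -> bool:
    """Admit the edit only if all gate conditions hold."""
    if not type_checks(edit):
        return False
    candidate = apply_edit(kb, edit)
    before = evaluate(kb)
    try:
        after = evaluate(candidate)
    except Exception:
        return False  # the candidate KB failed to execute
    improved = (after.validation_success > before.validation_success
                or after.trajectory_health > before.trajectory_health)
    if not improved:
        return False
    # Protected-regression checks must all still pass on the candidate.
    return all(check(candidate) for check in protected_checks)


# Toy usage with made-up evaluation logic.
def toy_evaluate(kb: KB) -> Metrics:
    if any(e.body == "crash" for e in kb.values()):
        raise RuntimeError("KB failed to execute")
    has_recovery = any(e.kind == "recovery_rule" for e in kb.values())
    return Metrics(0.75 if has_recovery else 0.5, 0.5)


base_kb: KB = {"pick": Entry("pick", "operator", "grasp object")}
protected = [lambda kb: "pick" in kb and kb["pick"].kind == "operator"]

good_edit = Edit(Entry("recover_drop", "recovery_rule", "regrasp"))
accepted = gate(base_kb, good_edit, toy_evaluate, protected)
```

Because the gate is a pure function of the KB, the edit, and the evaluation callbacks, the same candidate always receives the same verdict, which is the property the abstract attributes to a deterministic gate.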

