AI · DB · Oct 10, 2025

Safe, Untrusted, "Proof-Carrying" AI Agents: toward the agentic lakehouse

arXiv:2510.09567v1 · 5 citations · h-index: 11
Originality: Synthesis-oriented
AI Analysis

This paper addresses security and governance concerns for organizations that use AI-driven automation in data lakehouses. The contribution appears incremental, since it builds on existing concepts such as proof-carrying code.

The paper tackles the problem of trust and safety for AI agents operating in sensitive data lakehouse environments. It proposes a safe-by-design approach based on data branching and declarative environments, and presents a proof-of-concept showing that untrusted AI agents can operate safely on production data.

Data lakehouses run sensitive workloads, where AI-driven automation raises concerns about trust, correctness, and governance. We argue that API-first, programmable lakehouses provide the right abstractions for safe-by-design, agentic workflows. Using Bauplan as a case study, we show how data branching and declarative environments extend naturally to agents, enabling reproducibility and observability while reducing the attack surface. We present a proof-of-concept in which agents repair data pipelines using correctness checks inspired by proof-carrying code. Our prototype demonstrates that untrusted AI agents can operate safely on production data and outlines a path toward a fully agentic lakehouse.
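The branch-then-verify idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToyLakehouse`, `create_branch`, `merge`, the two agent functions, and the `no_null_amounts` check are hypothetical names, the catalog is a plain in-memory dictionary, and none of this is Bauplan's actual API or the paper's prototype. The sketch only shows the general pattern: an untrusted agent writes to an isolated branch, and its changes reach `main` only if every correctness check passes.

```python
from copy import deepcopy
from typing import Callable, Dict, List

Table = List[dict]
Catalog = Dict[str, Table]
Check = Callable[[Catalog], bool]


class ToyLakehouse:
    """In-memory stand-in for a lakehouse with data branches (hypothetical)."""

    def __init__(self, main: Catalog) -> None:
        self.branches: Dict[str, Catalog] = {"main": deepcopy(main)}

    def create_branch(self, name: str, source: str = "main") -> Catalog:
        # A branch is an isolated copy: writes to it never touch the source.
        self.branches[name] = deepcopy(self.branches[source])
        return self.branches[name]

    def merge(self, name: str, checks: List[Check], into: str = "main") -> bool:
        # Gate the merge: the target is replaced only if all checks pass.
        candidate = self.branches[name]
        if not all(check(candidate) for check in checks):
            return False
        self.branches[into] = deepcopy(candidate)
        return True


def no_null_amounts(catalog: Catalog) -> bool:
    """Correctness check: every order must carry a non-null amount."""
    return all(row["amount"] is not None for row in catalog["orders"])


def repairing_agent(catalog: Catalog) -> None:
    """Stand-in for an agent that repairs the data (fills nulls with 0.0)."""
    for row in catalog["orders"]:
        if row["amount"] is None:
            row["amount"] = 0.0


def careless_agent(catalog: Catalog) -> None:
    """Stand-in for an agent that makes things worse (adds a null row)."""
    catalog["orders"].append({"id": 99, "amount": None})


lake = ToyLakehouse({"orders": [{"id": 1, "amount": 10.0},
                                {"id": 2, "amount": None}]})

# The repairing agent works on its own branch; the merge is accepted.
repairing_agent(lake.create_branch("agent-fix"))
merged_good = lake.merge("agent-fix", [no_null_amounts])

# The careless agent's branch fails the check; main is left untouched.
careless_agent(lake.create_branch("agent-bad"))
merged_bad = lake.merge("agent-bad", [no_null_amounts])
```

In this toy version the agent code itself is never trusted; only the checks run at merge time are. A real system would presumably replace the deep copies with zero-copy catalog branches and run checks in a sandboxed, declaratively specified environment, but those details are outside what the abstract states.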

