Pneuma-Seeker: A Relational Reification Mechanism to Align AI Agents with Human Work over Relational Data
Pneuma-Seeker targets data workers who struggle to articulate precise information needs. The contribution appears incremental, since it builds on existing LLM and agentic methods.
The paper tackles the problem of AI agents misinterpreting under-specified natural-language requests over relational data. It introduces Pneuma-Seeker, a system that uses relational reification to iteratively refine a shared schema and then construct an executable program that computes the answer. The authors report higher answer accuracy than state-of-the-art baselines.
When faced with data problems, many data workers cannot articulate their information need precisely enough for software to help. Although LLMs interpret natural-language requests, they behave brittly when intent is under-specified, e.g., hallucinating fields, assuming join paths, or producing ungrounded answers. We present Pneuma-Seeker, a system built around a central idea: relational reification. Pneuma-Seeker represents a user's evolving information need as a relational schema: a concrete, analysis-ready data model shared between user and system. Rather than answering prompts directly, Pneuma-Seeker iteratively refines this schema, then discovers and prepares relevant sources to construct a relation and executable program that compute the answer. Pneuma-Seeker employs an LLM-powered agentic architecture with conductor-style planning and macro- and micro-level context management to operate effectively over heterogeneous relational corpora. We evaluate Pneuma-Seeker across multiple domains against state-of-the-art academic and industrial baselines, demonstrating higher answer accuracy. Deployment in a real organization highlights trust and inspectability as essential requirements for LLM-mediated data systems.
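To make the idea of relational reification more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the information need can be held as a small target schema that is refined column by column and then compiled into an inspectable SQL query. The `TargetSchema` class, its `refine` and `to_sql` methods, and the toy `customers` and `orders` tables are all hypothetical names introduced only for illustration. In the actual system, the refinement steps and source discovery are driven by an LLM agent, which this sketch does not model.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TargetSchema:
    """Hypothetical stand-in for the schema shared between user and system."""
    name: str
    # Maps each output column to the SQL expression that computes it.
    columns: Dict[str, str] = field(default_factory=dict)

    def refine(self, column: str, expression: str) -> None:
        # One refinement step: add or redefine a column of the target relation.
        self.columns[column] = expression

    def to_sql(self, source: str, group_by: Optional[str] = None) -> str:
        # Compile the current schema into a query the user can read and run.
        select = ", ".join(f"{expr} AS {col}" for col, expr in self.columns.items())
        sql = f"SELECT {select} FROM {source}"
        if group_by:
            sql += f" GROUP BY {group_by} ORDER BY {group_by}"
        return sql


# Toy corpus (assumed data, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER, region TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'East'), (2, 'West');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 5.0), (12, 2, 7.5);
    """
)

# A vague request ("how are regions doing?") is refined in two explicit steps.
schema = TargetSchema(name="spend_by_region")
schema.refine("region", "region")
schema.refine("total_spend", "SUM(amount)")

# The join path is stated explicitly rather than assumed silently.
query = schema.to_sql("orders JOIN customers USING (customer_id)", group_by="region")
rows = conn.execute(query).fetchall()
print(query)
print(rows)
```

Because both the schema and the generated query are plain, readable artifacts, a user can check which fields and join path were chosen before trusting the answer, which is the kind of inspectability the abstract highlights.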