AI · DB · Apr 7

BDI-Kit Demo: A Toolkit for Programmable and Conversational Data Harmonization

arXiv:2604.06405 · 28.8 · h-index: 1
AI Analysis

BDI-Kit addresses data integration bottlenecks for researchers and analysts, though the contribution is incremental: it builds on existing harmonization methods and adds new interfaces to them.

The paper tackles data harmonization by introducing BDI-Kit, a toolkit with two interfaces: a Python API for building pipelines programmatically and an AI-assisted chat interface for natural-language interaction. Both support iterative exploration and refinement of schema and value matches.

Data harmonization remains a major bottleneck for integrative analysis due to heterogeneity in schemas, value representations, and domain-specific conventions. BDI-Kit provides an extensible toolkit for schema and value matching. It exposes two complementary interfaces tailored to different user needs: a Python API enabling developers to construct harmonization pipelines programmatically, and an AI-assisted chat interface allowing domain experts to harmonize data through natural language dialogue. This demonstration showcases how users interact with BDI-Kit to iteratively explore, validate, and refine schema and value matches through a combination of automated matching, AI-assisted reasoning, and user-driven refinement. We present two scenarios: (i) using the Python API to programmatically compose primitives, examine intermediate outputs, and reuse transformations; and (ii) conversing with the AI assistant in natural language to access BDI-Kit's capabilities and iteratively refine outputs based on the assistant's suggestions.
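To make the abstract's terms concrete, here is a minimal sketch of schema matching, value matching, and user-driven refinement. It does not use BDI-Kit's actual API. The function names (`similarity`, `match_schema`, `match_values`), the `overrides` parameter, and the 0.5 threshold are hypothetical, and plain string similarity from Python's standard `difflib` stands in for whatever matchers the toolkit provides.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_schema(source_cols, target_cols, threshold=0.5):
    """Map each source column to its most similar target column.

    Columns whose best score falls below `threshold` map to None,
    which flags them for manual review.
    """
    matches = {}
    for col in source_cols:
        best = max(target_cols, key=lambda t: similarity(col, t))
        matches[col] = best if similarity(col, best) >= threshold else None
    return matches


def match_values(source_vals, target_vals, overrides=None, threshold=0.5):
    """Map source values to target values.

    `overrides` (hypothetical) holds user-supplied corrections that take
    precedence over the automated match: the "user-driven refinement" step.
    """
    overrides = overrides or {}
    result = {}
    for val in source_vals:
        if val in overrides:
            result[val] = overrides[val]
            continue
        best = max(target_vals, key=lambda t: similarity(val, t))
        result[val] = best if similarity(val, best) >= threshold else None
    return result


# Schema matching: names that differ only in case match exactly.
columns = match_schema(["Patient_Age", "SEX"], ["patient_age", "sex", "diagnosis"])

# Value matching: single-letter codes are too dissimilar to match automatically.
automatic = match_values(["M", "F"], ["male", "female"])

# Refinement: the user supplies the mappings the matcher could not find.
refined = match_values(["M", "F"], ["male", "female"],
                       overrides={"M": "male", "F": "female"})
```

In this sketch the automated pass leaves `"M"` and `"F"` unmapped, and the second call resolves them from user input. This mirrors the explore, validate, and refine loop the abstract describes, whether the correction comes from code or from a chat dialogue.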
