SEAI Sep 16, 2025

Prompt2DAG: A Modular Methodology for LLM-Based Data Enrichment Pipeline Generation

arXiv:2509.13487v1
2 citations
h-index: 4
Originality: Incremental advance
AI Analysis

This work aims to democratize data pipeline development by automating workflow generation. The advance is incremental, since it builds on existing LLM and template-based methods.

The paper addresses the problem of generating reliable data enrichment pipelines from natural language descriptions. It introduces Prompt2DAG, a methodology that transforms these descriptions into executable Apache Airflow DAGs. The Hybrid approach achieves a 78.5% success rate, outperforming the LLM-only (66.2%) and Direct (29.2%) approaches.

Developing reliable data enrichment pipelines demands significant engineering expertise. We present Prompt2DAG, a methodology that transforms natural language descriptions into executable Apache Airflow DAGs. We evaluate four generation approaches -- Direct, LLM-only, Hybrid, and Template-based -- across 260 experiments using thirteen LLMs and five case studies to identify optimal strategies for production-grade automation. Performance is measured using a penalized scoring framework that combines reliability with code quality (SAT), structural integrity (DST), and executability (PCT). The Hybrid approach emerges as the optimal generative method, achieving a 78.5% success rate with robust quality scores (SAT: 6.79, DST: 7.67, PCT: 7.76). This significantly outperforms the LLM-only (66.2% success) and Direct (29.2% success) methods. Our findings show that reliability, not intrinsic code quality, is the primary differentiator. Cost-effectiveness analysis reveals the Hybrid method is over twice as efficient as Direct prompting per successful DAG. We conclude that a structured, hybrid approach is essential for balancing flexibility and reliability in automated workflow generation, offering a viable path to democratize data pipeline development.
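The arithmetic behind the reported figures can be checked with a short sketch. It assumes a full factorial design (every LLM run on every case study under every approach), which the abstract does not state explicitly but which matches its total of 260 experiments. The success counts below are inferred from the rounded percentages, not taken from the paper, and the helper names are hypothetical. The last step is a break-even calculation only: the abstract gives no per-attempt cost figures, so none are assumed here.

```python
# Figures stated in the abstract.
TOTAL_EXPERIMENTS = 260
NUM_LLMS = 13
NUM_CASE_STUDIES = 5
APPROACHES = ["Direct", "LLM-only", "Hybrid", "Template-based"]
REPORTED_SUCCESS_RATE = {"Direct": 0.292, "LLM-only": 0.662, "Hybrid": 0.785}

# Assumption: full factorial design, so each approach gets the same number of runs.
runs_per_approach = NUM_LLMS * NUM_CASE_STUDIES
assert runs_per_approach * len(APPROACHES) == TOTAL_EXPERIMENTS


def implied_successes(rate: float, runs: int) -> int:
    """Whole number of successful runs closest to the reported rate (inferred)."""
    return round(rate * runs)


def attempts_per_success(rate: float) -> float:
    """Expected number of attempts needed to obtain one successful DAG."""
    return 1.0 / rate


for name, rate in REPORTED_SUCCESS_RATE.items():
    successes = implied_successes(rate, runs_per_approach)
    recomputed = round(100 * successes / runs_per_approach, 1)
    print(f"{name}: about {successes}/{runs_per_approach} runs "
          f"({recomputed}%), {attempts_per_success(rate):.2f} attempts per success")

# Break-even: Hybrid stays cheaper per successful DAG as long as one Hybrid
# attempt costs less than this multiple of one Direct attempt.
break_even_ratio = REPORTED_SUCCESS_RATE["Hybrid"] / REPORTED_SUCCESS_RATE["Direct"]
print(f"Break-even per-attempt cost ratio (Hybrid / Direct): {break_even_ratio:.2f}")
```

Under these assumptions, a Direct prompt needs roughly 3.4 attempts per working DAG against roughly 1.3 for Hybrid, which illustrates how reliability alone can drive a large gap in cost per successful DAG even when a single Hybrid attempt is more expensive.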

