Mar 10, 2025

DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation

arXiv:2503.07044v2 | 7 citations | h-index: 8 | EMNLP
Originality: Incremental advance
AI Analysis

DatawiseAgent addresses the need for more adaptive and robust automation tools for data scientists, though it appears to be an incremental improvement over existing agent frameworks.

The paper targets two weaknesses of LLM agents for data science automation: limited generalization and over-reliance on SOTA LLMs. It introduces DatawiseAgent, a notebook-centric LLM agent framework that the authors report achieves state-of-the-art performance across diverse scenarios and models, surpassing baselines such as AutoGen and TaskWeaver.

Existing large language model (LLM) agents for automating data science show promise, but they remain constrained by narrow task scopes, limited generalization across tasks and models, and over-reliance on state-of-the-art (SOTA) LLMs. We introduce DatawiseAgent, a notebook-centric LLM agent framework for adaptive and robust data science automation. Inspired by how human data scientists work in computational notebooks, DatawiseAgent introduces a unified interaction representation and a multi-stage architecture based on finite-state transducers (FSTs). This design enables flexible long-horizon planning, progressive solution development, and robust recovery from execution failures. Extensive experiments across diverse data science scenarios and models show that DatawiseAgent consistently achieves SOTA performance by surpassing strong baselines such as AutoGen and TaskWeaver, demonstrating superior effectiveness and adaptability. Further evaluations reveal graceful performance degradation under weaker or smaller models, underscoring the robustness and scalability.
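As a rough illustration of the finite-state-transducer idea mentioned in the abstract, the sketch below models an agent loop as a transition table that maps a (state, input signal) pair to a (next state, emitted action) pair. The stage names (plan, execute, debug), the signals, and the actions are hypothetical and chosen only for illustration; they are not taken from the paper. In a real system an LLM and a notebook kernel would produce the signals.

```python
# Hypothetical stage names, for illustration only (not the paper's stages).
PLAN, EXECUTE, DEBUG, DONE = "plan", "execute", "debug", "done"

# Transition table of a finite-state transducer:
# (current state, input signal) -> (next state, emitted action).
# All signals and actions here are assumed names.
TRANSITIONS = {
    (PLAN, "ready"): (EXECUTE, "write_cell"),
    (EXECUTE, "ok"): (PLAN, "advance_plan"),
    (EXECUTE, "error"): (DEBUG, "inspect_traceback"),
    (EXECUTE, "finished"): (DONE, "report"),
    (DEBUG, "fixed"): (EXECUTE, "rerun_cell"),
    (DEBUG, "give_up"): (PLAN, "discard_cell"),
}


def run_fst(signals, start=PLAN):
    """Feed input signals through the transducer.

    Returns the final state and the list of emitted actions.
    Raises ValueError if a signal has no transition from the current state.
    """
    state = start
    actions = []
    for signal in signals:
        if state == DONE:
            break  # terminal state: ignore any remaining input
        key = (state, signal)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from {state!r} on {signal!r}")
        state, action = TRANSITIONS[key]
        actions.append(action)
    return state, actions


if __name__ == "__main__":
    # A run where the first execution fails, is repaired, then completes.
    final_state, emitted = run_fst(["ready", "error", "fixed", "finished"])
    print(final_state, emitted)
```

The "error" and "give_up" transitions show one way that recovery from execution failures could be expressed: a failed cell routes to a debugging state, which either repairs and reruns the cell or hands control back to planning.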

