CL, AI | Mar 10, 2025

DatawiseAgent: A Notebook-Centric LLM Agent Framework for Adaptive and Robust Data Science Automation

Ziming You, Yumiao Zhang, Dexuan Xu, Yiwei Lou, Yandong Yan, Wei Wang, Huaming Zhang, Yu Huang

arXiv:2503.07044v2 | 15.59 citations | h-index: 8 | EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the need for more adaptive and robust automation tools for data scientists, though it appears to be an incremental improvement over existing agent frameworks.

The paper tackles the problem of limited generalization and over-reliance on SOTA LLMs in data science automation by introducing DatawiseAgent, a notebook-centric LLM agent framework that achieves state-of-the-art performance across diverse scenarios and models, surpassing baselines like AutoGen and TaskWeaver.

Existing large language model (LLM) agents for automating data science show promise, but they remain constrained by narrow task scopes, limited generalization across tasks and models, and over-reliance on state-of-the-art (SOTA) LLMs. We introduce DatawiseAgent, a notebook-centric LLM agent framework for adaptive and robust data science automation. Inspired by how human data scientists work in computational notebooks, DatawiseAgent introduces a unified interaction representation and a multi-stage architecture based on finite-state transducers (FSTs). This design enables flexible long-horizon planning, progressive solution development, and robust recovery from execution failures. Extensive experiments across diverse data science scenarios and models show that DatawiseAgent consistently achieves SOTA performance, surpassing strong baselines such as AutoGen and TaskWeaver and demonstrating superior effectiveness and adaptability. Further evaluations reveal graceful performance degradation under weaker or smaller models, underscoring its robustness and scalability.
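
The abstract describes a multi-stage architecture based on finite-state transducers but does not list the stages or their transitions. The following is a minimal sketch of the general idea only, assuming a hypothetical set of stages (`plan`, `execute`, `debug`, `finish`), hypothetical feedback signals (`ok`, `error`, `done`), and hypothetical action names. None of these names come from the paper. A transducer differs from a plain state machine in that every transition also emits an output, here the agent's next action.

```python
from typing import Dict, List, Tuple

# Hypothetical stages, signals, and actions, chosen for illustration only.
# Each entry maps (current stage, feedback signal) -> (next stage, emitted action).
TRANSITIONS: Dict[Tuple[str, str], Tuple[str, str]] = {
    ("plan", "ok"): ("execute", "write_code_cell"),
    ("plan", "done"): ("finish", "report_answer"),
    ("execute", "ok"): ("plan", "append_cell_to_notebook"),
    ("execute", "error"): ("debug", "inspect_traceback"),
    ("debug", "ok"): ("plan", "keep_fixed_cell"),
    ("debug", "error"): ("debug", "revise_cell"),
}


def run_fst(signals: List[str], start: str = "plan") -> Tuple[str, List[str]]:
    """Consume feedback signals one by one and return (final stage, emitted actions)."""
    state = start
    actions: List[str] = []
    for signal in signals:
        key = (state, signal)
        if key not in TRANSITIONS:
            raise ValueError(f"no transition from stage {state!r} on signal {signal!r}")
        state, action = TRANSITIONS[key]
        actions.append(action)
    return state, actions


if __name__ == "__main__":
    # A run in which execution fails, one repair attempt fails, and the next succeeds.
    final_state, emitted = run_fst(["ok", "error", "error", "ok", "done"])
    print(final_state)
    print(emitted)
```

In this sketch, recovery from an execution failure is simply a transition into the `debug` stage that loops until the feedback signal is `ok`, after which control returns to planning. A real agent would derive the signals from notebook cell execution and LLM output rather than from a fixed list.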
