CLMay 24, 2023

Is GPT-4 a Good Data Analyst?

arXiv:2305.15038v2154 citations
Originality Synthesis-oriented
AI Analysis

This addresses concerns about AI replacing data analysts by providing empirical evidence, though it is incremental as it builds on existing LLM capabilities.

The study investigated whether GPT-4 can perform as a data analyst by conducting head-to-head comparisons with human professionals across various domains, finding that GPT-4 achieved comparable performance to humans.

As large language models (LLMs) have demonstrated their powerful capabilities in plenty of domains and tasks, including context understanding, code generation, language generation, data storytelling, etc., many data analysts may raise concerns if their jobs will be replaced by artificial intelligence (AI). This controversial topic has drawn great attention in public. However, we are still at a stage of divergent opinions without any definitive conclusion. Motivated by this, we raise the research question of "is GPT-4 a good data analyst?" in this work and aim to answer it by conducting head-to-head comparative studies. In detail, we regard GPT-4 as a data analyst to perform end-to-end data analysis with databases from a wide range of domains. We propose a framework to tackle the problems by carefully designing the prompts for GPT-4 to conduct experiments. We also design several task-specific evaluation metrics to systematically compare the performance between several professional human data analysts and GPT-4. Experimental results show that GPT-4 can achieve comparable performance to humans. We also provide in-depth discussions about our results to shed light on further studies before reaching the conclusion that GPT-4 can replace data analysts.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes