Is Data Interpreter superseded?

Data Interpreter (LLM reasoning / chain-of-thought) is superseded: it is cited as a baseline and beaten by newer methods. One paper critiques it and one beats it on benchmarks, which ranks it #43 of 772 most-superseded methods. It belongs to the sub-problem cluster led by ReAct. Newer alternatives in the same sub-problem include OLIVIA, Planner-centric Plan-Execute paradigm, SR^2, and DS-STAR.

Superseded baseline · #43 of 772 most-superseded

Data Interpreter

Data Interpreter: An LLM Agent For Data Science

LLM reasoning / chain-of-thought · first seen Feb 28, 2024

Superseded: cited as a baseline and beaten by newer methods

1 paper critiques it · 1 beats it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites Data Interpreter as a baseline.

“a critical limitation of this approach is its reliance on successful code execution as the sole proxy for correctness. This often leads to sub-optimal plans, as execution success does not guarantee logical accuracy or alignment with user intent.”
— DS-STAR: Data Science Agent via Iterative Planning and Verification
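
The gap the critique describes, between code that runs and code that is correct, can be shown with a small sketch. This is an illustration only, not the implementation of Data Interpreter or DS-STAR; the function names (`runs_ok`, `mean_order_value`, `matches_intent`) and the sample rows are hypothetical.

```python
# Hypothetical sample data: two orders and one refund.
ROWS = [
    {"kind": "order", "amount": 100.0},
    {"kind": "order", "amount": 50.0},
    {"kind": "refund", "amount": -30.0},
]


def runs_ok(fn, rows):
    """Execution-only check: accept the step if it does not raise."""
    try:
        fn(rows)
        return True
    except Exception:
        return False


def mean_order_value(rows):
    """Intended to average order amounts, but averages every row."""
    # Logic bug: refund rows are included, yet nothing here raises an error.
    amounts = [r["amount"] for r in rows]
    return sum(amounts) / len(amounts)


def matches_intent(fn, rows):
    """Assumed extra check: compare against an independent reading of the intent."""
    orders = [r["amount"] for r in rows if r["kind"] == "order"]
    expected = sum(orders) / len(orders)
    return abs(fn(rows) - expected) < 1e-9


print(runs_ok(mean_order_value, ROWS))         # execution succeeds
print(matches_intent(mean_order_value, ROWS))  # result does not match the intent
```

Here the step passes the execution check while returning 40.0 instead of the intended 75.0, so an agent that treats execution success as its only signal would accept a wrong plan.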

Beaten on benchmarks

Head-to-head results where a newer method reports beating Data Interpreter. Values are copied from the source paper's tables — verify against the cited paper.

All results below are from DS-STAR: Data Science Agent via Iterative Planning and Verification.

DS-STAR beats Data Interpreter · Easy [Gemini-2.5-Pro]: 87.50 vs 72.22
DS-STAR beats Data Interpreter · Hard [Gemini-2.5-Pro]: 45.24 vs 3.44
DS-STAR beats Data Interpreter · Total [Original setting]: 44.69 vs 31.32
DS-STAR beats Data Interpreter · Total [Oracle setting]: 52.55 vs 33.57
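
To read the reported results as margins, the short sketch below subtracts each Data Interpreter score from the matching DS-STAR score. The values are the ones listed above; the variable names are hypothetical, and the scores are treated as plain numbers because this page does not state their metric or unit.

```python
# (setting, DS-STAR score, Data Interpreter score), copied from the results above.
RESULTS = [
    ("Easy [Gemini-2.5-Pro]", 87.50, 72.22),
    ("Hard [Gemini-2.5-Pro]", 45.24, 3.44),
    ("Total [Original setting]", 44.69, 31.32),
    ("Total [Oracle setting]", 52.55, 33.57),
]

# Absolute margin per setting, rounded to two decimals to match the source precision.
MARGINS = {name: round(ds_star - baseline, 2) for name, ds_star, baseline in RESULTS}

for name, margin in MARGINS.items():
    print(f"{name}: +{margin:.2f}")
```

The largest margin is in the Hard setting, where the baseline score is lowest.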

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.