LGMay 13

Evaluating Memory Condensation Strategies for Coding Agents in Data-Driven Scientific Discovery

Renuka Chintalapati, Sid Raskar, Anurag Acharya, Jared Willard, Patrick Emami, Sameera Horawalavithana

arXiv:2605.1885473.9

Predicted impact top 21% in LG · last 90 daysOriginality Synthesis-oriented

AI Analysis

This work provides the first systematic comparison of memory condensation strategies for coding agents in scientific discovery, revealing that current methods offer no benefit and often increase costs.

The paper evaluates eight memory condensation strategies for coding agents on 60 scientific discovery tasks across six domains, finding that no strategy significantly improves hypothesis quality, LLM-based condensers increase token costs by 24-94%, and masking tool-call outputs yields 8.6% net savings.

Coding agents accumulate extensive context during long-running tasks, yet fixed context windows force practitioners to choose between truncation and task failure. While numerous memory condensation strategies have been proposed, from simple sliding windows to LLM-generated summaries, no systematic comparison exists to guide strategy selection, especially in scientific discovery tasks. We evaluate eight memory condensation strategies using GPT-4o on sixty DiscoveryBench tasks spanning six scientific domains (480 total evaluations). We find that no condenser significantly alters hypothesis quality, while LLM-based condensers increase token costs by 24-94 percent, and masking tool-call outputs achieves an 8.6 percent net savings. We also observe that the optimal condenser for data-driven scientific discovery varies by scientific domain and task length.

View on arXiv PDF

Similar