GSEM: Graph-based Self-Evolving Memory for Experience Augmented Clinical Reasoning
This work addresses the challenge of improving clinical reasoning in decision-making agents by reducing noise and increasing reliability in experience reuse. It represents an incremental advance in memory-augmented methods.
The paper tackles noisy retrieval and unreliable reuse in memory-augmented clinical decision-making by proposing GSEM, a graph-based self-evolving memory framework that organizes experiences into a dual-layer graph. Across MedR-Bench and MedAgentsBench, GSEM achieves the highest average accuracy among all baselines: 70.90% with DeepSeek-V3.2 and 69.24% with Qwen3.5-35B.
Clinical decision-making agents can benefit from reusing prior decision experience. However, many memory-augmented methods store experiences as independent records without explicit relational structure, which can lead to noisy retrieval and unreliable reuse, and in some cases even hurts performance compared to direct LLM inference. We propose GSEM (Graph-based Self-Evolving Memory), a clinical memory framework that organizes clinical experiences into a dual-layer memory graph. The graph captures both the decision structure within each experience and the relational dependencies across experiences, and it supports applicability-aware retrieval and online feedback-driven calibration of node quality and edge weights. Across MedR-Bench and MedAgentsBench with two LLM backbones, GSEM achieves the highest average accuracy among all baselines, reaching 70.90% and 69.24% with DeepSeek-V3.2 and Qwen3.5-35B, respectively. Code is available at https://github.com/xhan1022/gsem.
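To make the abstract's vocabulary concrete, the following is a minimal illustrative sketch of a dual-layer memory graph with quality-weighted retrieval and feedback-driven calibration. It is not the GSEM implementation: the class names (Experience, MemoryGraph), the keyword-overlap applicability score, the neighbor-propagation factor of 0.5, and the moving-average update rule are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Experience:
    """One stored experience (hypothetical schema, not from the paper)."""
    exp_id: str
    steps: list[str]        # inner layer: ordered decision steps of this case
    keywords: set[str]      # assumed stand-in for an applicability signature
    quality: float = 0.5    # node quality, adjusted by feedback


@dataclass
class MemoryGraph:
    nodes: dict[str, Experience] = field(default_factory=dict)
    # outer layer: undirected weighted edges between experiences
    edges: dict[frozenset, float] = field(default_factory=dict)

    def add(self, exp: Experience) -> None:
        self.nodes[exp.exp_id] = exp

    def link(self, a: str, b: str, weight: float = 0.5) -> None:
        if a != b:
            self.edges[frozenset((a, b))] = weight

    def _match(self, exp: Experience, query: set[str]) -> float:
        # Assumed applicability score: Jaccard overlap of keyword sets.
        union = exp.keywords | query
        return len(exp.keywords & query) / len(union) if union else 0.0

    def retrieve(self, query: set[str], k: int = 2) -> list[str]:
        # Base score: applicability scaled by node quality.
        base = {i: self._match(e, query) * e.quality
                for i, e in self.nodes.items()}
        scores = dict(base)
        # Each node also receives a share of its neighbors' base scores,
        # scaled by edge weight (0.5 is an arbitrary illustrative factor).
        for pair, w in self.edges.items():
            a, b = tuple(pair)
            scores[a] += 0.5 * w * base[b]
            scores[b] += 0.5 * w * base[a]
        return sorted(scores, key=scores.get, reverse=True)[:k]

    def feedback(self, used: list[str], success: bool, lr: float = 0.2) -> None:
        # Move quality of used nodes, and weights of edges between them,
        # toward 1.0 on success or 0.0 on failure.
        target = 1.0 if success else 0.0
        for i in used:
            node = self.nodes[i]
            node.quality += lr * (target - node.quality)
        used_set = set(used)
        for pair in self.edges:
            if pair <= used_set:
                self.edges[pair] += lr * (target - self.edges[pair])


if __name__ == "__main__":
    g = MemoryGraph()
    g.add(Experience("e1", ["order troponin"], {"chest pain", "troponin"}))
    g.add(Experience("e2", ["read ECG"], {"chest pain", "ecg"}))
    g.add(Experience("e3", ["check allergens"], {"rash"}))
    g.link("e1", "e2")
    hits = g.retrieve({"chest pain", "troponin"})
    g.feedback(hits, success=True)
    print(hits, g.nodes["e1"].quality)
```

In this sketch, an experience that repeatedly contributes to correct decisions rises in later rankings, while an unrelated experience such as "e3" is left unchanged. The retrieval and calibration rules in the released code may differ substantially.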