Is MemGPT superseded?

MemGPT (Agent / long-term memory) is superseded: newer papers cite it as a baseline and report beating it. 4 papers critique it and 3 beat it on benchmarks, which ranks it #6 of 63 most-superseded methods. Its sub-problem cluster is led by RAPTOR. Newer alternatives in the same sub-problem include HingeMem, AtomMem, REMem, Generative Semantic Workspace (GSW), and CAM.


Superseded baseline · #6 of 63 most-superseded

MemGPT

MemGPT: Towards LLMs as Operating Systems

Agent / long-term memory · first seen Oct 12, 2023

What papers say

Verbatim critique sentences, each from a paper that cites MemGPT as a baseline.

“However, these approaches face significant limitations in handling diverse real-world tasks. While they can provide basic memory functionality, their operations are typically constrained by predefined structures and fixed workflows. These constraints stem from their reliance on rigid operational patterns, particularly in memory writing and retrieval processes. Such inflexibility leads to poor generalization in new environments and limited effectiveness in long-term interactions.”
— A-MEM: Agentic Memory for LLM Agents
“they operate in an ‘open loop’ without feedback on whether the constructed memories benefit downstream tasks”
— MemBuilder: Reinforcing LLMs for Long-Term Memory Construction via Attributed Dense Rewards
“While simple to implement, these unstructured memory designs fall short when critical information is dispersed across multiple entries.”
— CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension
“Early memory mechanisms in LLM-based agents typically relied on heuristic-based static workflows.”
— AtomMem : Learnable Dynamic Agentic Memory with Atomic Memory Operation

Beaten on benchmarks

Head-to-head results where a newer method reports beating MemGPT. Values are copied from the source paper's tables — verify against the cited paper.

Mnemosyne beats MemGPT (source: Mnemosyne: An Unsupervised, Human-Inspired Long-Term Memory Architecture for Edge-Based LLMs). Each pair is Mnemosyne vs MemGPT.
Overall (%) [LoCoMo benchmark] · 54.55 vs 44.18

A-MEM beats MemGPT (source: A-MEM: Agentic Memory for LLM Agents). Each pair is A-MEM vs MemGPT.
Multi Hop F1 [GPT 4o-mini] · 27.02 vs 26.65
Multi Hop BLEU [GPT 4o-mini] · 20.09 vs 17.72
Temporal F1 [GPT 4o-mini] · 45.85 vs 25.52
Temporal BLEU [GPT 4o-mini] · 36.67 vs 19.44
Open Domain F1 [GPT 4o-mini] · 12.14 vs 9.15
Open Domain BLEU [GPT 4o-mini] · 12.00 vs 7.44
Single Hop F1 [GPT 4o-mini] · 44.65 vs 41.04
Single Hop BLEU [GPT 4o-mini] · 37.06 vs 34.34
Multi Hop F1 [GPT 4o] · 32.86 vs 30.36
Multi Hop BLEU [GPT 4o] · 23.76 vs 22.83
Temporal F1 [GPT 4o] · 39.41 vs 17.29
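
A "beats" label does not say by how much. The sketch below computes the absolute gap (in points) and the relative gain over MemGPT for each head-to-head value listed above. It is only an illustration: the helper names `margins` and `summarize` are hypothetical, and it assumes the first number in each "X vs Y" pair is the newer method's score and the second is MemGPT's.

```python
# Values copied from the head-to-head list above.
# Assumption: in each "X vs Y" pair, X is the newer method and Y is MemGPT.
# Tuple layout: (newer method, metric, setting, newer score, MemGPT score)
RESULTS = [
    ("Mnemosyne", "Overall (%)", "LoCoMo benchmark", 54.55, 44.18),
    ("A-MEM", "Multi Hop F1", "GPT 4o-mini", 27.02, 26.65),
    ("A-MEM", "Multi Hop BLEU", "GPT 4o-mini", 20.09, 17.72),
    ("A-MEM", "Temporal F1", "GPT 4o-mini", 45.85, 25.52),
    ("A-MEM", "Temporal BLEU", "GPT 4o-mini", 36.67, 19.44),
    ("A-MEM", "Open Domain F1", "GPT 4o-mini", 12.14, 9.15),
    ("A-MEM", "Open Domain BLEU", "GPT 4o-mini", 12.00, 7.44),
    ("A-MEM", "Single Hop F1", "GPT 4o-mini", 44.65, 41.04),
    ("A-MEM", "Single Hop BLEU", "GPT 4o-mini", 37.06, 34.34),
    ("A-MEM", "Multi Hop F1", "GPT 4o", 32.86, 30.36),
    ("A-MEM", "Multi Hop BLEU", "GPT 4o", 23.76, 22.83),
    ("A-MEM", "Temporal F1", "GPT 4o", 39.41, 17.29),
]


def margins(newer, baseline):
    """Return (absolute gap in points, relative gain in percent of the baseline)."""
    gap = newer - baseline
    return gap, 100.0 * gap / baseline


def summarize(results):
    """Return one (method, metric, setting, gap, relative gain) row per result."""
    rows = []
    for method, metric, setting, newer, baseline in results:
        gap, rel = margins(newer, baseline)
        rows.append((method, metric, setting, round(gap, 2), round(rel, 1)))
    return rows


if __name__ == "__main__":
    for method, metric, setting, gap, rel in summarize(RESULTS):
        print(f"{method:<10} {metric:<17} [{setting}] +{gap:.2f} pts ({rel:+.1f}%)")
```

The margins vary widely: the A-MEM gap on Multi Hop F1 with GPT 4o-mini is 0.37 points, while on Temporal F1 with GPT 4o it is 22.12 points. The script does not verify the numbers themselves, so they should still be checked against the cited papers.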

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.