MemoryOS
Memory OS of AI Agent
Agent / long-term memory · first seen May 30, 2025
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 4 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites MemoryOS as a baseline.
“By treating outgoing connections as equally valid or using fixed graph-expansion rules, existing systems can fail to discriminate between highly relevant pathways and distracting noise, leading to degraded retrieval accuracy as memory grows.”
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
“Blocking Latency: Achieving structural depth often comes at the cost of interactivity. Approaches like MemoryOS and synchronous graph builders typically require heavy LLM operations on the critical path. As noted in benchmarks [wu2024longmemeval], such mechanisms incur prohibitive latency, rendering them impractical for real-time interaction.”
- MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents
Beaten on benchmarks
Head-to-head results where a newer method reports beating MemoryOS. Values are copied from each source paper's tables; verify them against the cited paper.
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · Overall [GPT-4o-mini]
0.739 vs 0.553
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · Overall [Qwen2.5-3B]
0.548 vs 0.280
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · LLM Score [GPT-4o-mini]
0.824 vs 0.592
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · F1 [GPT-4o-mini]
0.678 vs 0.477
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · LLM Score [Qwen2.5-3B]
0.527 vs 0.459
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · F1 [Qwen2.5-3B]
0.429 vs 0.350
- HAGE: Harnessing Agentic Memory via RL-Driven Weighted Graph Evolution
HAGE beats MemoryOS · Avg. Score [all]
0.739 vs 0.553
- General Agentic Memory Via Deep Research
GAM beats MemoryOS · LoCoMo Temporal F1 [GPT-4o-mini]
56.15 vs 41.15
- General Agentic Memory Via Deep Research
GAM beats MemoryOS · LoCoMo Temporal F1 [Qwen2.5-14B]
40.96 vs 32.24
- MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents
MAGMA beats MemoryOS · Judge [Overall]
0.700 vs 0.553
- MAGMA: A Multi-Graph based Agentic Memory Architecture for AI Agents
MAGMA beats MemoryOS · Latency (s, lower is better) [all methods]
1.47 vs 32.68
- Memory-R2: Fair Credit Assignment for Long-Horizon Memory-Augmented LLM Agents
Memory-R2 beats MemoryOS · F1 [Multi-hop]
38.41 vs 29.55
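To put the margins above on a common footing, the reported pairs can be converted into relative gains. The following is a minimal sketch and not part of the source page: the `Result` record and the `relative_gain` helper are hypothetical names, the values are a subset copied from the list above, and latency is treated as a lower-is-better metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One head-to-head entry (hypothetical record, values from the list above)."""
    method: str
    metric: str
    new: float        # value reported for the newer method
    memoryos: float   # value reported for MemoryOS
    lower_is_better: bool = False


def relative_gain(r: Result) -> float:
    """Fractional improvement of the newer method relative to MemoryOS."""
    if r.lower_is_better:
        return (r.memoryos - r.new) / r.memoryos
    return (r.new - r.memoryos) / r.memoryos


RESULTS = [
    Result("HAGE", "Overall [GPT-4o-mini]", 0.739, 0.553),
    Result("GAM", "LoCoMo Temporal F1 [GPT-4o-mini]", 56.15, 41.15),
    Result("MAGMA", "Judge [Overall]", 0.700, 0.553),
    Result("MAGMA", "Latency (s)", 1.47, 32.68, lower_is_better=True),
    Result("Memory-R2", "F1 [Multi-hop]", 38.41, 29.55),
]

# Print one line per entry with the gain as a signed percentage.
for r in RESULTS:
    print(f"{r.method:10s} {r.metric:34s} {relative_gain(r):+.1%}")
```

Rows from different papers may come from different evaluation setups, so these gains are best read per paper and not as a ranking across papers.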
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Jun 9, 2026
- May 30, 2026
- MemGuard · MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models · May 27, 2026
- DeferMem · DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA · May 21, 2026
- May 20, 2026
- May 3, 2026
- Apr 23, 2026
- Apr 2, 2026
- Chronos · Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory · Mar 17, 2026
- Mar 15, 2026
- Jan 13, 2026
- Agentic Memory (AgeMem) · Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents · Jan 5, 2026