Mem0
Mem0: Building Production-Ready AI Agents with Scalable Long-Term Memory
Agent / long-term memory · first seen Apr 28, 2025
heavily superseded — a standard baseline that newer methods routinely beat
3 papers critique it · 15 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites Mem0 as a baseline.
“While graph databases provide structured organization for memory systems, their reliance on predefined schemas and relationships fundamentally limits their adaptability.”
— A-MEM: Agentic Memory for LLM Agents
“LLMs cannot use memory tools effectively and using such tools increases redundancy in planning and overall tool use.”
— MEMTRACK: Evaluating Long-Term Memory and State Tracking in Multi-Platform Dynamic Agent Environments
“they operate in an ‘open loop’ without feedback on whether the constructed memories benefit downstream tasks”
— MemBuilder: Reinforcing LLMs for Long-Term Memory Construction via Attributed Dense Rewards
Beaten on benchmarks
Head-to-head results where a newer method reports beating Mem0. Values are copied from the source paper's tables — verify against the cited paper.
- Mnemosyne: An Unsupervised, Human-Inspired Long-Term Memory Architecture for Edge-Based LLMs
Mnemosyne beats Mem0 · Overall (%) [LoCoMo benchmark]
54.55 vs 45.68
- SwiftMem: Fast Agentic Memory via Query-aware Indexing
SwiftMem beats Mem0 · BLEU-1 [Temporal Reasoning]
0.569 vs 0.332
- SwiftMem: Fast Agentic Memory via Query-aware Indexing
SwiftMem beats Mem0 · BLEU-1 [Overall]
0.467 vs 0.365
- SwiftMem: Fast Agentic Memory via Query-aware Indexing
SwiftMem beats Mem0 · Search latency [Overall]
11 vs 784
- MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models
MemGuard beats Mem0 · Acc. (%) [HaluMem extraction]
89.53 vs 60.86
- MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models
MemGuard beats Mem0 · C [HaluMem update]
70.79 vs 25.50
- GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY beats Mem0 · LLM-judge accuracy [LongMemEval Micro]
64.0 vs 52.6
- GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY beats Mem0 · LLM-judge accuracy [LongMemEval Macro]
64.0 vs 51.7
- GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY beats Mem0 · LLM-judge accuracy [LoCoMo]
59.1 vs 51.0
- DeltaMem: Towards Agentic Memory Management via Reinforcement Learning
DeltaMem beats Mem0 · LJ [Overall]
75.13 vs 57.86
- DeltaMem: Towards Agentic Memory Management via Reinforcement Learning
DeltaMem beats Mem0 · QA [Question Answering]
66.43 vs 53.02
- MEMTRACK: Evaluating Long-Term Memory and State Tracking in Multi-Platform Dynamic Agent Environments
gpt-5+NoMem beats Mem0 · Efficiency [GPT-5]
0.667 vs 0.656
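The listed pairs use different scales (percentages, BLEU-1 fractions, latency), so raw gaps are hard to compare across entries. Below is a minimal sketch of one way to put them on a common footing by computing absolute and relative margins over the Mem0 score. The `HeadToHead` class and its `higher_is_better` flag are hypothetical helpers, not part of any cited paper. Treating search latency as lower-is-better is an assumption, and the source gives no unit for it. The three rows are copied from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadToHead:
    """One reported comparison against Mem0 (hypothetical helper type)."""
    challenger: str
    metric: str
    challenger_score: float
    mem0_score: float
    higher_is_better: bool = True  # assumption; set False for latency-style metrics

    def absolute_margin(self) -> float:
        """Gap in the metric's own units; positive means the challenger is ahead."""
        gap = self.challenger_score - self.mem0_score
        return gap if self.higher_is_better else -gap

    def relative_margin(self) -> float:
        """Gap expressed as a fraction of the Mem0 score."""
        return self.absolute_margin() / self.mem0_score


# Values copied from the entries above; verify against the cited papers.
rows = [
    HeadToHead("Mnemosyne", "Overall (%) [LoCoMo benchmark]", 54.55, 45.68),
    HeadToHead("SwiftMem", "Search latency [Overall]", 11, 784, higher_is_better=False),
    HeadToHead("gpt-5+NoMem", "Efficiency [GPT-5]", 0.667, 0.656),
]

for r in rows:
    print(
        f"{r.challenger:12s} {r.metric:32s} "
        f"abs={r.absolute_margin():+.3f} rel={r.relative_margin():+.1%}"
    )
```

Relative margins make it easier to separate large reported gaps from near-ties, but they do not account for differences in evaluation setup between papers.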
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Jun 9, 2026
- May 30, 2026
- MemGuard · MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models · May 27, 2026
- DeferMem · DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA · May 21, 2026
- May 20, 2026
- May 3, 2026
- Apr 23, 2026
- Apr 2, 2026
- Chronos · Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory · Mar 17, 2026
- Mar 15, 2026
- Jan 13, 2026
- Agentic Memory (AgeMem) · Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents · Jan 5, 2026