LightMem
LightMem: Lightweight and Efficient Memory-Augmented Generation
Agent / long-term memory · first seen Oct 21, 2025
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 4 papers beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites LightMem as a baseline.
“Taking LightMem on LoCoMo as an example: open-domain (75.9% accuracy) and single-hop (70.7% accuracy) questions are handled well, but multi-hop reasoning drops to 60.6% and temporal reasoning to just 45.8%.”
— GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
Beaten on benchmarks
Head-to-head results where a newer method reports beating LightMem. Values are copied from the source papers' tables; verify each against the cited paper.
- GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY beats LightMem · LLM-judge accuracy [LongMemEval Micro]
72.6 vs 68.8
- GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY beats LightMem · LLM-judge accuracy [LongMemEval Macro]
70.9 vs 67.1
- GRAVITY: Architecture-Agnostic Structured Anchoring for Long-Horizon Conversational Memory
GRAVITY beats LightMem · LLM-judge accuracy [LoCoMo]
75.8 vs 70.1
- DeltaMem: Towards Agentic Memory Management via Reinforcement Learning
DeltaMem beats LightMem · LJ [Overall]
75.13 vs 66.43
- DeltaMem: Towards Agentic Memory Management via Reinforcement Learning
DeltaMem beats LightMem · QA [Question Answering]
66.43 vs 58.38
- General Agentic Memory Via Deep Research
GAM beats LightMem · LongBenchv2 F1 [GPT-4o-mini]
73.59 vs 69.51
- DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
DeferMem beats LightMem · Accuracy [LongMemEval-S]
70.00 vs 68.64
- DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
DeferMem beats LightMem · Token cost, lower is better [LongMemEval-S]
0.00 vs 28.25
- DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
DeferMem beats LightMem · Time cost, lower is better [LongMemEval-S]
90.30 vs 283.76
- DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
DeferMem beats LightMem · Accuracy [LoCoMo (all convs) non-adversarial]
88.25 vs 72.99
- DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
DeferMem beats LightMem · Accuracy [LoCoMo (all convs) adversarial]
97.09 vs 90.81
- DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA
DeferMem beats LightMem · Time cost, lower is better [LoCoMo (all convs) non-adversarial]
83.56 vs 815.32
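The head-to-head entries above mix accuracy metrics (higher is better) with cost metrics (lower is better), so a raw difference can point either way. The following is a minimal Python sketch of one way to normalize them into a signed margin. The `HeadToHead` class, its field names, and the `lower_is_better` flag are hypothetical and not part of any cited paper or tool; the numbers are a subset copied from the listings above, and treating token and time cost as lower-is-better is an assumption inferred from how the entries are labeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadToHead:
    """One reported comparison against LightMem (hypothetical record type)."""
    challenger: str
    metric: str
    benchmark: str
    challenger_value: float
    lightmem_value: float
    lower_is_better: bool = False  # assumption: set True only for cost metrics

    def margin(self) -> float:
        """Signed margin; positive means the challenger is better than LightMem."""
        diff = self.challenger_value - self.lightmem_value
        return -diff if self.lower_is_better else diff


# Values copied from the listings above; verify against the cited papers.
results = [
    HeadToHead("GRAVITY", "LLM-judge accuracy", "LongMemEval Micro", 72.6, 68.8),
    HeadToHead("GRAVITY", "LLM-judge accuracy", "LoCoMo", 75.8, 70.1),
    HeadToHead("DeferMem", "Accuracy", "LongMemEval-S", 70.00, 68.64),
    HeadToHead("DeferMem", "Time cost", "LongMemEval-S", 90.30, 283.76,
               lower_is_better=True),
]

for r in results:
    print(f"{r.challenger} on {r.benchmark} ({r.metric}): {r.margin():+.2f}")
```

Margins computed this way are only comparable within a single metric and benchmark, since the papers may use different judges, backbones, and units.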
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Jun 9, 2026
- May 30, 2026
- MemGuard · MemGuard: Preventing Memory Contamination in Long-Term Memory-Augmented Large Language Models · May 27, 2026
- DeferMem · DeferMem: Query-Time Evidence Distillation via Reinforcement Learning for Long-Term Memory QA · May 21, 2026
- May 20, 2026
- May 3, 2026
- Apr 23, 2026
- Apr 2, 2026
- Chronos · Chronos: Temporal-Aware Conversational Agents with Structured Event Retrieval for Long-Term Memory · Mar 17, 2026
- Mar 15, 2026
- Jan 13, 2026
- Agentic Memory (AgeMem) · Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents · Jan 5, 2026