LG, Mar 16

SmartSearch: How Ranking Beats Structure for Conversational Memory Retrieval

arXiv:2603.15599, 66.11 citations, h-index: 3
AI Analysis

This work addresses efficient memory retrieval for conversational AI systems. It presents a significant improvement over existing methods, though the contribution is incremental in that it builds on known retrieval techniques.

The paper tackles conversational memory retrieval by showing that a deterministic pipeline with a single learned component (the ranking stage) can outperform complex LLM-based systems, achieving 93.5% on LoCoMo and 88.4% on LongMemEval-S while using 8.5x fewer tokens than full-context baselines.

Recent conversational memory systems invest heavily in LLM-based structuring at ingestion time and learned retrieval policies at query time. We show that neither is necessary. SmartSearch retrieves from raw, unstructured conversation history using a fully deterministic pipeline: NER-weighted substring matching for recall, rule-based entity discovery for multi-hop expansion, and a CrossEncoder+ColBERT rank fusion stage -- the only learned component -- running on CPU in ~650ms. Oracle analysis on two benchmarks identifies a compilation bottleneck: retrieval recall reaches 98.6%, but without intelligent ranking only 22.5% of gold evidence survives truncation to the token budget. With score-adaptive truncation and no per-dataset tuning, SmartSearch achieves 93.5% on LoCoMo and 88.4% on LongMemEval-S, exceeding all known memory systems under the same evaluation protocol on both benchmarks while using 8.5x fewer tokens than full-context baselines.
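The abstract names a rank fusion stage and score-adaptive truncation but does not say how either is computed. As a rough illustration only, the sketch below fuses two rankings with reciprocal rank fusion (a common technique swapped in here, not confirmed as the paper's method) and then keeps passages in score order until either the token budget is exhausted or the score drops below a fraction of the top score. The function names, `k=60`, and `rel_threshold` are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of passage ids into (id, score) pairs.

    Assumption: reciprocal rank fusion stands in for the paper's
    CrossEncoder+ColBERT fusion, whose exact form is not given.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, pid in enumerate(ranking, start=1):
            scores[pid] += 1.0 / (k + rank)
    # Sort by descending score; break ties by id so output is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def score_adaptive_truncate(fused, token_counts, budget, rel_threshold=0.5):
    """Keep top passages until the budget or the score cutoff is hit.

    Assumption: the cutoff is a fixed fraction of the best score; the
    paper's actual truncation rule may differ.
    """
    if not fused:
        return []
    top_score = fused[0][1]
    kept, used = [], 0
    for pid, score in fused:
        if score < rel_threshold * top_score:
            break  # remaining passages score too low relative to the best
        cost = token_counts[pid]
        if used + cost > budget:
            break  # next passage would exceed the token budget
        kept.append(pid)
        used += cost
    return kept


# Hypothetical rankings from two rankers over the same candidate set.
cross_encoder_rank = ["p3", "p1", "p2", "p4"]
colbert_rank = ["p3", "p1", "p4", "p2"]
token_counts = {"p1": 150, "p2": 200, "p3": 100, "p4": 50}

fused = reciprocal_rank_fusion([cross_encoder_rank, colbert_rank])
context = score_adaptive_truncate(fused, token_counts, budget=300)
```

The point of the sketch is the ordering of operations: because truncation happens after ranking, the passages that survive the budget are the highest-scored ones rather than whichever happened to be retrieved first, which is the bottleneck the abstract describes.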
