DC CL IR NI · May 25

Neural Router: Semantic Content Matching for Agentic AI

Lauri Lovén, Abhishek Kumar, Alexander Engelhardt, Alaa Saleh, Roberto Morabito, Xiaoli Liu, Naser Hossein Motlagh, Sasu Tarkoma

arXiv:2605.25701 · 42.71 citations

Predicted impact top 41% in DC · last 90 days

Originality Incremental advance

AI Analysis

For practitioners deploying LLM-based semantic matching in edge-cloud systems, the paper provides actionable guidelines on when compression helps and when model scale is necessary.

The paper characterizes cost-accuracy trade-offs for using LLMs as semantic content matchers in publish/subscribe systems for agentic AI, identifying two crossovers: a context-window crossover where compression reduces LLM invocations, and a discrimination-capacity crossover where accuracy collapses regardless of context budget. Key findings show that above the discrimination crossover, only frontier-scale models work, and model selection dominates pipeline tuning.

Large language models (LLMs) can serve as the semantic-matching engine of a content-based publish/subscribe broker for agentic AI across the edge-cloud computing continuum, bridging the vocabulary and modality gaps that defeat keyword and embedding filters. Framed as offline multi-label retrieval over three public datasets spanning social-media, legal, and smart-home sensor domains (six LLMs, seven baselines), our central contribution is a two-crossover cost-accuracy characterisation: an analytical context-window crossover below which a CoverAndMerge compression pipeline reduces LLM invocations, and an empirical discrimination-capacity crossover above which matching accuracy collapses independently of context budget, by a model-dependent factor of parameter count and training generation. Two findings carry practical weight: above the discrimination crossover, compression cannot recover accuracy and only frontier-scale models clear large subscription sets; and there backend choice dominates configuration choice, so model selection, not pipeline tuning, is the primary operator lever. We accompany this with three composable algorithms and a per-cluster Quality-of-Experience framework for autonomic LLM-tier selection.
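The context-window crossover can be illustrated with simple batching arithmetic. The sketch below is not the paper's CoverAndMerge pipeline or its analytical model. It assumes subscriptions are packed into prompts of a fixed token budget, that every subscription costs the same number of tokens, and that compression merges the subscription set down to a fixed fraction. The names `llm_invocations`, `compare`, and `keep_ratio` are hypothetical, and prompt overhead and message tokens are ignored.

```python
import math


def llm_invocations(num_items: int, tokens_per_item: int, context_budget: int) -> int:
    """LLM calls needed to show every item to the model once,
    packing as many items per prompt as the token budget allows."""
    per_call = context_budget // tokens_per_item
    if per_call == 0:
        raise ValueError("a single item does not fit in the context budget")
    return math.ceil(num_items / per_call)


def compare(num_subscriptions: int, tokens_per_subscription: int,
            context_budget: int, keep_ratio: float) -> tuple[int, int]:
    """Return (calls without compression, calls with compression).

    keep_ratio is an assumed fraction of subscriptions that survive
    merging; it stands in for whatever a real compression step achieves.
    """
    merged = math.ceil(num_subscriptions * keep_ratio)
    raw = llm_invocations(num_subscriptions, tokens_per_subscription, context_budget)
    compressed = llm_invocations(merged, tokens_per_subscription, context_budget)
    return raw, compressed


# Small context budget: the subscription set spans several prompts,
# so shrinking it saves calls.
small_budget = compare(1000, 50, 8_000, 0.25)

# Large context budget: everything already fits in one prompt,
# so compression cannot reduce the call count further.
large_budget = compare(1000, 50, 128_000, 0.25)
```

Under these assumptions, compression only pays off while the uncompressed subscription set overflows a single prompt. Once the budget is large enough to hold it, both variants need one call, which is one simple reading of a context-window crossover.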
