LG · NI · Mar 13

PLUME: Building a Network-Native Foundation Model for Wireless Traces via Protocol-Aware Tokenization

Swadhin Pradhan, Shazal Irshad, Jerome Henry

arXiv:2603.13647 · 22.6 · h-index: 9

AI Analysis

Plume enables on-prem, privacy-preserving root cause analysis for wireless networks, a domain-specific advance.

The paper builds a foundation model for wireless packet traces, using protocol-aware tokenization to capture the native structure of the data. Its compact 140M-parameter model achieves 74-97% next-packet token accuracy and AUROC >= 0.99 for zero-shot anomaly detection.

Foundation models succeed when they learn in the native structure of a modality, whether morphology-respecting tokens in language or pixels in vision. Wireless packet traces deserve the same treatment: meaning emerges from layered headers, typed fields, timing gaps, and cross-packet state machines, not flat strings. We present Plume (Protocol Language Understanding Model for Exchanges), a compact 140M-parameter foundation model for 802.11 traces that learns from structured PDML dissections. A protocol-aware tokenizer splits along the dissector field tree, emits gap tokens for timing, and normalizes identifiers, yielding 6.2x shorter sequences than BPE with higher per-token information density. Trained on a curated corpus, Plume achieves 74-97% next-packet token accuracy across five real-world failure categories and AUROC >= 0.99 for zero-shot anomaly detection. On the same prediction task, frontier LLMs (Claude Opus 4.6, GPT-5.4) score comparably despite receiving identical protocol context, yet Plume does so with > 600x fewer parameters, fitting on a single GPU at effectively zero marginal cost vs. cloud API pricing, enabling on-prem, privacy-preserving root cause analysis.
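
To make the tokenizer description concrete, the following is a minimal sketch of the three steps the abstract names: walking the dissector field tree, emitting gap tokens for timing, and normalizing identifiers. It is not the authors' implementation. The token formats (`<PKT>`, `<GAP_1e-4>`, `<MAC_0>`), the log10 gap bucketing, the function names, and the hand-written, simplified PDML sample are all assumptions made for illustration; only the field names (`frame.time_delta`, `wlan.sa`, `wlan.da`, `wlan.fc.type_subtype`) follow Wireshark's naming.

```python
import math
import re
import xml.etree.ElementTree as ET

# Matches a colon-separated MAC address such as aa:bb:cc:00:11:22.
MAC_RE = re.compile(r"^(?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2}$")


def gap_token(delta_seconds):
    """Bucket an inter-packet gap into a coarse log10 bin (assumed scheme)."""
    if delta_seconds <= 0:
        return "<GAP_0>"
    exponent = math.floor(math.log10(delta_seconds))
    exponent = max(-6, min(1, exponent))  # clamp bins to 1 us .. 10 s
    return f"<GAP_1e{exponent}>"


def tokenize_pdml(pdml_text):
    """Turn a PDML document into a flat list of field-level tokens."""
    root = ET.fromstring(pdml_text)
    mac_ids = {}  # MAC address -> stable alias, in order of first appearance
    tokens = []
    for packet in root.iter("packet"):
        tokens.append("<PKT>")
        for proto in packet.findall("proto"):
            # iter() also visits nested fields, so the whole tree is walked.
            for field in proto.iter("field"):
                name = field.get("name", "")
                show = field.get("show", "")
                if name == "frame.time_delta":
                    # Timing becomes one gap token, not a digit string.
                    tokens.append(gap_token(float(show)))
                elif MAC_RE.match(show):
                    # Replace the concrete address with a per-trace alias.
                    alias = mac_ids.setdefault(
                        show.lower(), f"<MAC_{len(mac_ids)}>"
                    )
                    tokens.append(f"{name}={alias}")
                else:
                    tokens.append(f"{name}={show}")
    return tokens


# Hand-written, simplified PDML; real dissections carry many more fields.
SAMPLE = """<pdml>
  <packet>
    <proto name="frame">
      <field name="frame.time_delta" show="0.000000000"/>
    </proto>
    <proto name="wlan">
      <field name="wlan.fc.type_subtype" show="11"/>
      <field name="wlan.sa" show="aa:bb:cc:00:11:22"/>
      <field name="wlan.da" show="66:77:88:99:aa:bb"/>
    </proto>
  </packet>
  <packet>
    <proto name="frame">
      <field name="frame.time_delta" show="0.000150000"/>
    </proto>
    <proto name="wlan">
      <field name="wlan.fc.type_subtype" show="11"/>
      <field name="wlan.sa" show="66:77:88:99:aa:bb"/>
      <field name="wlan.da" show="aa:bb:cc:00:11:22"/>
    </proto>
  </packet>
</pdml>"""

if __name__ == "__main__":
    for token in tokenize_pdml(SAMPLE):
        print(token)
```

In this sketch each dissector field becomes a single token, so sequence length tracks the number of fields, not the number of characters. The same station also keeps the same alias whether it appears as source or destination, which is one plausible reading of "normalizes identifiers".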
