Active Inference-Based Adaptive Routing for Heterogeneous Edge AI Services
This work addresses adaptive routing in edge computing systems under dynamic workloads and infrastructure variability. The results are preliminary, and no concrete performance numbers are reported.
AIF-Router uses Active Inference to autonomously balance latency, throughput, and resource utilization across heterogeneous edge AI services without offline training, demonstrating stable online learning despite device instability.
Edge computing enables AI inference closer to data sources, reducing latency and bandwidth costs. However, orchestrating AI services across the cloud-edge continuum remains challenging due to dynamic workloads and infrastructure variability. We present AIF-Router, an Active Inference-based routing framework that autonomously learns to balance latency, throughput, and resource utilization across multi-tier AI services without offline training. AIF-Router performs Bayesian state inference and expected free energy minimization to guide routing decisions based on observability-driven real-time metrics. Despite device instability on edge nodes, AIF-Router exhibits stable online learning behavior and demonstrates the feasibility of applying Active Inference for adaptive AI service orchestration in unreliable edge environments. Our findings highlight both the promise and practical challenges of deploying self-adaptive decision-making frameworks for real-world edge AI systems.
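To make the decision loop concrete, the following is a minimal sketch of Bayesian state inference followed by expected free energy minimization over two routing actions. It is not the AIF-Router implementation: the two-state model, the function names (`infer_state`, `expected_free_energy`, `choose_route`), and every probability below are assumptions chosen for illustration. Expected free energy is computed in its common discrete form as risk (divergence of predicted observations from preferred ones) plus ambiguity (expected observation entropy).

```python
import math

# All numbers are illustrative assumptions, not values from AIF-Router.
# Hidden states: 0 = edge tier healthy, 1 = edge tier overloaded.
# Observations:  0 = low latency,       1 = high latency.
LIKELIHOOD = [[0.9, 0.2],   # P(low latency  | state)
              [0.1, 0.8]]   # P(high latency | state)

# TRANSITIONS[action][next_state][state] = P(next_state | state, action).
TRANSITIONS = {
    "edge":  [[0.7, 0.1], [0.3, 0.9]],   # more edge traffic keeps load high
    "cloud": [[0.9, 0.6], [0.1, 0.4]],   # offloading lets the edge recover
}

PREFERRED_OBS = [0.95, 0.05]  # the agent prefers to observe low latency


def normalize(p):
    total = sum(p)
    return [x / total for x in p]


def infer_state(prior, likelihood, obs):
    """Bayesian update: q(s) is proportional to P(obs | s) * prior(s)."""
    return normalize([likelihood[obs][s] * prior[s] for s in range(len(prior))])


def expected_free_energy(q_state, transition, likelihood, preferred_obs):
    """Risk plus ambiguity for one action, one step ahead."""
    n_states, n_obs = len(q_state), len(likelihood)
    # Predicted next-state and observation distributions under this action.
    q_next = [sum(transition[s2][s] * q_state[s] for s in range(n_states))
              for s2 in range(n_states)]
    q_obs = [sum(likelihood[o][s2] * q_next[s2] for s2 in range(n_states))
             for o in range(n_obs)]
    # Risk: KL divergence from predicted to preferred observations.
    risk = sum(q * math.log(q / p) for q, p in zip(q_obs, preferred_obs) if q > 0)
    # Ambiguity: expected entropy of observations given the predicted state.
    ambiguity = -sum(
        q_next[s2] * sum(likelihood[o][s2] * math.log(likelihood[o][s2])
                         for o in range(n_obs) if likelihood[o][s2] > 0)
        for s2 in range(n_states))
    return risk + ambiguity


def choose_route(q_state, transitions, likelihood, preferred_obs):
    """Pick the action with the lowest expected free energy."""
    scores = {a: expected_free_energy(q_state, t, likelihood, preferred_obs)
              for a, t in transitions.items()}
    return min(scores, key=scores.get), scores


# Usage: a high-latency observation shifts belief toward "overloaded".
posterior = infer_state([0.5, 0.5], LIKELIHOOD, obs=1)
route, scores = choose_route(posterior, TRANSITIONS, LIKELIHOOD, PREFERRED_OBS)
print(posterior, route, scores)
```

In this toy setting, a high-latency observation raises the belief that the edge tier is overloaded, and the "cloud" action then scores a lower expected free energy than "edge". A deployed router would also need to learn the likelihood and transition parameters online from real-time metrics, which this sketch leaves out.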