LG AI NI · May 5, 2025

Adaptive Budgeted Multi-Armed Bandits for IoT with Dynamic Resource Constraints

Shubham Vaishnav, Praveen Kumar Donta, Sindri Magnússon

arXiv:2505.02640v24.11 citations · h-index: 26

Originality: Incremental advance

AI Analysis

The paper addresses resource management for IoT devices under dynamic constraints, but the advance is incremental because it builds on existing bandit methods.

IoT systems must respond in real time under fluctuating resource constraints. The paper proposes a Budgeted Multi-Armed Bandit framework with a decaying violation budget, proves that its Budgeted UCB algorithm achieves sublinear regret and logarithmic constraint violations, and reports faster adaptation and better constraint satisfaction than standard online learning methods in wireless communication simulations.

Internet of Things (IoT) systems increasingly operate in environments where devices must respond in real time while managing fluctuating resource constraints, including energy and bandwidth. Yet, current approaches often fall short in addressing scenarios where operational constraints evolve over time. To address these limitations, we propose a novel Budgeted Multi-Armed Bandit framework tailored for IoT applications with dynamic operational limits. Our model introduces a decaying violation budget, which permits limited constraint violations early in the learning process and gradually enforces stricter compliance over time. We present the Budgeted Upper Confidence Bound (UCB) algorithm, which adaptively balances performance optimization and compliance with time-varying constraints. We provide theoretical guarantees showing that Budgeted UCB achieves sublinear regret and logarithmic constraint violations over the learning horizon. Extensive simulations in a wireless communication setting show that our approach achieves faster adaptation and better constraint satisfaction than standard online learning methods. These results highlight the framework's potential for building adaptive, resource-aware IoT systems.
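
The idea of a decaying violation budget can be illustrated with a minimal sketch. This is not the authors' Budgeted UCB; it is a generic constrained UCB loop written under stated assumptions: rewards and costs are Bernoulli, the slack decays as eps0 / sqrt(t), and an arm is treated as feasible when a lower confidence bound on its cost is within the current threshold plus slack. The names `decaying_slack`, `budgeted_ucb`, `eps0`, and `threshold` are hypothetical and chosen only for illustration.

```python
import math
import random


def decaying_slack(t, eps0=0.5):
    """Assumed violation budget: allowed slack shrinks as eps0 / sqrt(t)."""
    return eps0 / math.sqrt(t)


def budgeted_ucb(reward_means, cost_means, threshold, horizon, seed=0):
    """Illustrative constrained UCB with a decaying slack (not the paper's algorithm).

    threshold is a callable t -> cost limit, so the constraint may vary over time.
    Returns per-arm pull counts and the number of rounds in which the chosen
    arm's true mean cost exceeded the limit for that round.
    """
    rng = random.Random(seed)
    k = len(reward_means)
    pulls = [0] * k
    reward_sum = [0.0] * k
    cost_sum = [0.0] * k
    violations = 0

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize the estimates
        else:
            bonus = [math.sqrt(2 * math.log(t) / pulls[i]) for i in range(k)]
            reward_ucb = [reward_sum[i] / pulls[i] + bonus[i] for i in range(k)]
            cost_lcb = [cost_sum[i] / pulls[i] - bonus[i] for i in range(k)]
            # Early rounds tolerate more cost; the tolerance tightens over time.
            limit = threshold(t) + decaying_slack(t)
            feasible = [i for i in range(k) if cost_lcb[i] <= limit]
            if feasible:
                arm = max(feasible, key=lambda i: reward_ucb[i])
            else:
                arm = min(range(k), key=lambda i: cost_lcb[i])

        # Simulated Bernoulli feedback for the chosen arm.
        reward = 1.0 if rng.random() < reward_means[arm] else 0.0
        cost = 1.0 if rng.random() < cost_means[arm] else 0.0
        pulls[arm] += 1
        reward_sum[arm] += reward
        cost_sum[arm] += cost
        if cost_means[arm] > threshold(t):
            violations += 1

    return pulls, violations


if __name__ == "__main__":
    pulls, violations = budgeted_ucb(
        reward_means=[0.9, 0.6, 0.3],
        cost_means=[0.9, 0.4, 0.1],
        threshold=lambda t: 0.5,  # constant limit here; any function of t works
        horizon=2000,
    )
    print("pulls per arm:", pulls)
    print("violating rounds:", violations)
```

In this toy setup the highest-reward arm also has the highest cost, so an unconstrained learner would keep selecting it. Passing a non-constant `threshold` function is one way to mimic the time-varying limits the abstract describes.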

