An Online Algorithm for Computation Offloading in Non-Stationary Environments
This work addresses latency reduction for users in mobile and edge computing environments. Its contribution is incremental: it builds on existing multi-armed bandit methods, with a focus on non-stationary conditions.
The paper tackles latency minimization in task-offloading scenarios with dynamic wireless links and computing resources by modeling server selection as a multi-armed bandit problem. The resulting online learning algorithm outperforms state-of-the-art algorithms by up to ~1 s.
We consider the latency minimization problem in a task-offloading scenario, where multiple servers are available to the user equipment for outsourcing computational tasks. To account for the temporally dynamic nature of the wireless links and the availability of the computing resources, we model the server selection as a multi-armed bandit (MAB) problem. In the considered MAB framework, rewards are characterized in terms of the end-to-end latency. We propose a novel online learning algorithm based on the principle of optimism in the face of uncertainty, which outperforms state-of-the-art algorithms by up to ~1 s. Our results highlight the significance of heavily discounting past rewards in dynamic environments.
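To make the idea of optimistic server selection with discounted rewards concrete, here is a minimal sketch in Python. It is not the algorithm proposed in the paper, which the abstract does not specify. It is a generic discounted-UCB rule, used only to show how discounting past rewards lets a bandit track a changing environment. The class name `DiscountedUCB`, the parameters `gamma` and `exploration`, the reward definition (negative latency), and the two-server latency profile in `simulate` are all assumptions made for illustration.

```python
import math


class DiscountedUCB:
    """Generic discounted-UCB bandit (illustrative; not the paper's algorithm)."""

    def __init__(self, n_servers, gamma=0.9, exploration=0.5):
        self.n = n_servers
        self.gamma = gamma            # discount factor: smaller forgets faster
        self.c = exploration          # weight of the optimism bonus (assumed value)
        self.counts = [0.0] * n_servers   # discounted number of pulls per server
        self.sums = [0.0] * n_servers     # discounted sum of rewards per server

    def select(self):
        # Try every server once before trusting any estimate.
        for k in range(self.n):
            if self.counts[k] == 0.0:
                return k
        total = max(sum(self.counts), 1.0)  # guard against log of a value below 1

        def index(k):
            mean = self.sums[k] / self.counts[k]
            bonus = self.c * math.sqrt(math.log(total) / self.counts[k])
            return mean + bonus  # optimism in the face of uncertainty

        return max(range(self.n), key=index)

    def update(self, server, latency):
        # Decay all statistics, then record the new observation.
        for k in range(self.n):
            self.counts[k] *= self.gamma
            self.sums[k] *= self.gamma
        self.counts[server] += 1.0
        self.sums[server] += -latency  # reward = negative end-to-end latency


def simulate(rounds=400, switch_at=200):
    """Two hypothetical servers; server 0 degrades abruptly at `switch_at`."""
    bandit = DiscountedUCB(n_servers=2)
    choices = []
    for t in range(rounds):
        # Assumed latencies in seconds, chosen only for the demonstration.
        latencies = [0.2 if t < switch_at else 1.0, 0.6]
        k = bandit.select()
        bandit.update(k, latencies[k])
        choices.append(k)
    return choices


if __name__ == "__main__":
    picks = simulate()
    print("share of server 0 before the switch:", picks[:200].count(0) / 200)
    print("share of server 1 after the switch:", picks[200:].count(1) / 200)
```

In this sketch a server that has not been chosen for a while sees its discounted count shrink, so its optimism bonus grows and it is eventually re-tested. With `gamma` close to 1 the learner reacts slowly to the latency change; a smaller `gamma` forgets stale observations faster at the cost of noisier estimates.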