Online Optimization for Randomized Network Resource Allocation with Long-Term Constraints
This work addresses resource allocation in communication networks with stochastic demands, offering an incremental improvement over existing methods for managing costs and constraints in such systems.
The paper tackles the problem of online resource reservation in a two-node network with uncertain job demands, aiming to minimize reservation costs while keeping cumulative violation and transport costs within a budget. It proposes an online saddle-point algorithm and provides theoretical bounds on regret and constraint violations, with numerical experiments comparing it to deterministic policies.
In this paper, we study an optimal online resource reservation problem in a simple communication network. The network is composed of two compute nodes connected by a local communication link. The system operates in discrete time: at each time slot, the administrator reserves resources for the servers before the actual job requests are known, and a cost is incurred for the reservations made. After the client requests are observed, jobs may be transferred from one server to the other to best accommodate the demands, at an additional transport cost. If some job requests cannot be satisfied, a violation occurs and a cost is paid for each blocked job. The goal is to minimize the overall reservation cost over a finite horizon while keeping the cumulative violation and transport costs under a given budget limit. To study this problem, we first formalize it as a repeated game against nature in which the reservations are drawn randomly according to a sequence of probability distributions derived from an online optimization problem over the space of allowable reservations. We then propose an online saddle-point algorithm, for which we present an upper bound on the associated K-benchmark regret together with an upper bound on the cumulative constraint violations. Finally, we present numerical experiments in which we compare the performance of our algorithm with that of simple deterministic resource allocation policies.
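To make the setting concrete, the following is a minimal sketch of the per-slot cost model and of a generic primal-dual (saddle-point) update over randomized reservations. It is not the paper's algorithm. The unit costs, the finite reservation grid, the Poisson demands, the step sizes, the per-slot budget, and the function names `slot_cost`, `saddle_point_step`, and `simulate` are all assumptions introduced for illustration. The sketch also assumes that transferring a job is no more expensive than blocking it, so overflow jobs are moved whenever spare capacity exists at the other node.

```python
import numpy as np

# Hypothetical unit costs; the source gives no numerical values.
C_RESERVE, C_TRANSPORT, C_VIOLATION = 1.0, 1.0, 5.0


def slot_cost(r, d):
    """Transport plus violation cost for reservations r=(r1, r2) and demands d=(d1, d2)."""
    r, d = np.asarray(r, dtype=float), np.asarray(d, dtype=float)
    surplus = np.maximum(r - d, 0.0)  # unused reserved capacity per node
    deficit = np.maximum(d - r, 0.0)  # unmet demand per node
    # Overflow at one node is moved to spare capacity at the other node.
    moved = min(deficit[0], surplus[1]) + min(deficit[1], surplus[0])
    blocked = deficit.sum() - moved   # jobs that cannot be served anywhere
    return C_TRANSPORT * moved + C_VIOLATION * blocked


def saddle_point_step(p, lam, f, g, budget, eta=0.05, mu=0.05):
    """One generic primal-dual step on L(p, lam) = <p, f> + lam * (<p, g> - budget)."""
    grad = f + lam * g
    # Primal: exponentiated-gradient descent, which keeps p on the simplex.
    w = p * np.exp(-eta * (grad - grad.min()))  # shift by the minimum for stability
    p_new = w / w.sum()
    # Dual: projected gradient ascent on the multiplier, kept nonnegative.
    lam_new = max(0.0, lam + mu * (float(p @ g) - budget))
    return p_new, lam_new


def simulate(T=200, max_units=4, budget=1.0, seed=0):
    """Run the sketch on an assumed finite grid of reservations with Poisson demands."""
    rng = np.random.default_rng(seed)
    grid = [(a, b) for a in range(max_units + 1) for b in range(max_units + 1)]
    f = np.array([C_RESERVE * (a + b) for a, b in grid])  # reservation cost per grid point
    p = np.full(len(grid), 1.0 / len(grid))               # start from the uniform distribution
    lam = 0.0
    paid_reserve = paid_constraint = 0.0
    for _ in range(T):
        idx = rng.choice(len(grid), p=p)   # reservation drawn before demand is seen
        d = rng.poisson(lam=2.0, size=2)   # demand revealed afterwards (assumed Poisson)
        g = np.array([slot_cost(q, d) for q in grid])
        paid_reserve += f[idx]
        paid_constraint += g[idx]
        p, lam = saddle_point_step(p, lam, f, g, budget)
    return p, lam, paid_reserve, paid_constraint
```

Calling `simulate()` returns the final distribution over reservations, the final multiplier, and the realized cumulative reservation and constraint costs. Comparing the last value with `T * budget` gives a rough view of the long-term constraint violation under these assumed parameters.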