LG AP ML
Jan 18, 2022

System-Agnostic Meta-Learning for MDP-based Dynamic Scheduling via Descriptive Policy

arXiv:2201.07051v2 1.8

Originality: Highly original

AI Analysis

This work addresses the need for adaptable scheduling policies in practical applications such as queuing and wireless networks, and it offers a novel approach rather than an incremental improvement.

The paper tackles the problem of dynamic scheduling in systems with changing characteristics by proposing a descriptive policy that learns a system-agnostic scheduling principle, enabling adaptation to unseen systems with minimal performance degradation.

Dynamic scheduling is an important problem in applications ranging from queuing to wireless networks. It addresses how to choose one item among multiple scheduling items at each timestep to achieve a long-term goal. Conventional approaches to dynamic scheduling find the optimal policy for one specific system, so the resulting policy is usable only for the corresponding system characteristics. Hence, such approaches are hard to use in a practical system whose characteristics change dynamically. This paper proposes a novel policy structure for MDP-based dynamic scheduling, a descriptive policy, which is system-agnostic: it can adapt to unseen system characteristics for the same task (dynamic scheduling). To this end, the descriptive policy learns a system-agnostic scheduling principle, in a nutshell, "which condition of items should have a higher priority in scheduling". Because the scheduling principle can be applied to any system, a descriptive policy learned in one system can be used in another. Experiments with simple explanatory scenarios and realistic application scenarios demonstrate that the descriptive policy enables system-agnostic meta-learning with very little performance degradation compared with conventional system-specific policies.
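
The idea of a scheduling principle that ranks items by their condition can be illustrated with a minimal sketch. This is not the paper's method: the linear scoring function, the feature choice (queue length and channel quality), the weights, and the names `priority` and `schedule` are all assumptions made for illustration. The point is only that a per-item scoring rule with shared parameters does not depend on how many items a system has.

```python
from typing import Sequence


def priority(features: Sequence[float], weights: Sequence[float]) -> float:
    """Score one item from its own condition only.

    The same weights are shared by every item, so the rule does not
    depend on the number of items. (Hypothetical linear scorer.)
    """
    return sum(w * x for w, x in zip(weights, features))


def schedule(items: Sequence[Sequence[float]], weights: Sequence[float]) -> int:
    """Return the index of the item with the highest priority score."""
    scores = [priority(f, weights) for f in items]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical per-item features: (queue length, channel quality).
weights = [1.0, 0.5]  # assumed values standing in for a learned principle

system_a = [(3.0, 0.2), (5.0, 0.9)]                           # two items
system_b = [(1.0, 0.1), (2.0, 0.4), (8.0, 0.3), (4.0, 1.0)]   # four items

# The same weights are reused, unchanged, on systems of different size.
choice_a = schedule(system_a, weights)
choice_b = schedule(system_b, weights)
```

A conventional policy that maps the full system state to an action index would need a different input and output size for `system_a` and `system_b`; scoring each item separately avoids that coupling, which is one plausible reading of why a learned principle could transfer between systems.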
