LG · Jan 15, 2014

Adaptive Stochastic Resource Control: A Machine Learning Approach

arXiv:1401.3434v1 · 23 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses industrial resource management problems such as scheduling and transportation, but it appears incremental because it builds on existing MDP and ADP techniques.

The paper tackles stochastic resource allocation with scarce, reusable resources and interconnected tasks by reformulating it as a Markov decision process and using approximate dynamic programming methods like fitted Q-learning with hash tables and support vector regression. Experimental results are presented on benchmark and industry-related data, but no concrete numbers are provided.

The paper investigates stochastic resource allocation problems with scarce, reusable resources and non-preemptive, time-dependent, interconnected tasks. This approach is a natural generalization of several standard resource management problems, such as scheduling and transportation problems. First, reactive solutions are considered and defined as control policies of suitably reformulated Markov decision processes (MDPs). We argue that this reformulation has several favorable properties: it has finite state and action spaces and it is aperiodic, hence all policies are proper and the space of control policies can be safely restricted. Next, approximate dynamic programming (ADP) methods, such as fitted Q-learning, are suggested for computing an efficient control policy. In order to compactly maintain the cost-to-go function, two representations are studied: hash tables and support vector regression (SVR), in particular ν-SVRs. Several additional improvements, such as the application of limited-lookahead rollout algorithms in the initial phases, action space decomposition, task clustering and distributed sampling, are also investigated. Finally, experimental results on both benchmark and industry-related data are presented.
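
To make the hash-table representation of the cost-to-go function concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it uses plain tabular Q-learning rather than fitted Q-learning, and the toy instance is an assumption made for illustration (one resource, three hypothetical tasks A, B, C with uniformly sampled integer durations, and total completion time as the cost). The names `TASKS`, `step`, `q_learning` and `greedy_order` are likewise invented here. Each episode ends once every task has run, so every policy is proper and no discounting is needed.

```python
import random
from collections import defaultdict

# Hypothetical toy instance: task -> (min, max) integer duration, sampled uniformly.
TASKS = {"A": (1, 3), "B": (8, 12), "C": (25, 35)}


def step(state, task, rng):
    """Run `task` to completion (non-preemptive) on the single resource.

    Every unfinished task, including `task`, waits for the sampled duration,
    so the immediate cost is duration * len(state).
    """
    lo, hi = TASKS[task]
    duration = rng.randint(lo, hi)
    return duration * len(state), state - {task}


def q_learning(episodes=20000, alpha=0.05, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # hash table: (state, action) -> estimated cost-to-go
    for _ in range(episodes):
        state = frozenset(TASKS)  # state = set of unfinished tasks
        while state:
            actions = sorted(state)
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = min(actions, key=lambda a: q[(state, a)])  # exploit
            cost, nxt = step(state, action, rng)
            # Undiscounted target; the empty (terminal) state costs 0.
            target = cost + min((q[(nxt, b)] for b in nxt), default=0.0)
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
    return q


def greedy_order(q):
    """Follow the learned Q-table greedily from the initial state."""
    state, order = frozenset(TASKS), []
    while state:
        action = min(sorted(state), key=lambda a: q[(state, a)])
        order.append(action)
        state = state - {action}
    return order


if __name__ == "__main__":
    print(greedy_order(q_learning()))
```

In this toy instance the expected durations are far apart, so the greedy policy read from the table should schedule the task with the shortest expected duration first. A dictionary keyed by `(state, action)` stores only the pairs actually visited, which is the appeal of a hash table over a dense array when the state space is large but sparsely explored.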
