AI LG

May 11, 2018

Human-Machine Collaborative Optimization via Apprenticeship Scheduling

Matthew Gombolay, Reed Jensen, Jessica Stigile, Toni Golen, Neel Shah, Sung-Hyun Son, Julie Shah

arXiv:1805.04220v1

12.111 citations

Originality: Incremental advance

AI Analysis

This paper addresses the problem of scaling human scheduling expertise in domains such as defense and healthcare. Its contribution is incremental: it applies apprenticeship learning to optimization.

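The paper tackles the challenge of capturing the heuristics of human domain experts for complex scheduling problems. It proposes a pairwise ranking formulation that learns policies from expert demonstrations, and it shows that these learned policies improve the efficiency of branch-and-bound search. On a variant of the weapon-to-target assignment problem, the combined technique produces solutions superior to those of human experts at a rate up to 9.5 times faster than an optimization approach, and it optimally solves problems twice as complex as those solved by a human demonstrator.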
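The pairwise ranking idea can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two toy features (deadline slack and travel distance), the demonstration data, and the perceptron, which is a swapped-in stand-in rather than the classifier used in the paper. Each expert decision is turned into feature-difference examples ("chosen task minus rejected task" is positive, the reverse is negative), and at decision time the candidate that wins the most pairwise comparisons is selected.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def pairwise_examples(candidates: Sequence[Vector], chosen: int) -> List[Tuple[Vector, int]]:
    """Turn one expert decision into labeled feature-difference examples."""
    examples = []
    for j, other in enumerate(candidates):
        if j == chosen:
            continue
        diff = tuple(a - b for a, b in zip(candidates[chosen], other))
        examples.append((diff, 1))                        # chosen preferred over other
        examples.append((tuple(-d for d in diff), -1))    # other not preferred over chosen
    return examples


def train_perceptron(examples, n_features: int, epochs: int = 50) -> List[float]:
    """Hypothetical stand-in learner: a plain perceptron on the difference vectors."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w


def rank_best(candidates: Sequence[Vector], w: Sequence[float]) -> int:
    """Return the index of the candidate that wins the most pairwise comparisons."""
    def wins(i: int) -> int:
        return sum(
            1
            for j in range(len(candidates))
            if j != i
            and sum(wk * (a - b) for wk, a, b in zip(w, candidates[i], candidates[j])) > 0
        )
    return max(range(len(candidates)), key=wins)


# Toy demonstrations: (candidate feature vectors, index the expert chose).
# Features are (deadline_slack, travel_distance); this expert picks the least slack.
demos = [
    ([(5.0, 1.0), (2.0, 3.0), (8.0, 0.5)], 1),
    ([(4.0, 2.0), (6.0, 1.0)], 0),
    ([(9.0, 0.2), (7.0, 4.0), (3.0, 2.5)], 2),
]

examples = [ex for cands, chosen in demos for ex in pairwise_examples(cands, chosen)]
weights = train_perceptron(examples, n_features=2)

new_candidates = [(6.0, 0.1), (1.0, 5.0), (4.0, 2.0)]
print(rank_best(new_candidates, weights))  # prints 1
```

Note that the learner never enumerates scheduling states: it only sees comparisons between the option the expert took and the options the expert passed over, which is what makes a formulation of this kind model-free.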

Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the "single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.
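One way a learned policy can speed up branch-and-bound is by supplying an initial incumbent, so the search can prune from the first node onward. The sketch below shows that mechanism on a toy assignment problem (each agent gets one distinct task, minimizing total cost). It is an illustration under stated assumptions, not the authors' algorithm: `greedy_policy` is a hypothetical stand-in for a policy learned from demonstrations, and the cost matrix and lower bound are invented for the example.

```python
from itertools import permutations
from math import inf
from typing import List, Optional, Sequence, Tuple

Cost = Sequence[Sequence[float]]


def total(cost: Cost, plan: Sequence[int]) -> float:
    """Total cost of assigning agent i to task plan[i]."""
    return sum(cost[i][j] for i, j in enumerate(plan))


def greedy_policy(cost: Cost) -> List[int]:
    """Hypothetical stand-in for a learned policy: each agent takes its cheapest free task."""
    free = set(range(len(cost)))
    plan = []
    for row in cost:
        j = min(sorted(free), key=lambda c: row[c])
        plan.append(j)
        free.remove(j)
    return plan


def branch_and_bound(cost: Cost, seed: Optional[Sequence[int]] = None) -> Tuple[List[int], float, int]:
    """Depth-first branch-and-bound; `seed` is an optional initial incumbent plan."""
    n = len(cost)
    best_plan = list(seed) if seed is not None else None
    best_cost = total(cost, seed) if seed is not None else inf
    nodes = 0  # number of search nodes visited

    def lower_bound(i: int, used: set, partial: float) -> float:
        # Optimistic completion: every remaining agent gets its cheapest unused task.
        return partial + sum(
            min(cost[r][c] for c in range(n) if c not in used) for r in range(i, n)
        )

    def dfs(i: int, plan: List[int], used: set, partial: float) -> None:
        nonlocal best_plan, best_cost, nodes
        nodes += 1
        if i == n:
            if partial < best_cost:
                best_plan, best_cost = list(plan), partial
            return
        if lower_bound(i, used, partial) >= best_cost:
            return  # cannot beat the incumbent: prune this subtree
        for c in range(n):
            if c not in used:
                plan.append(c)
                used.add(c)
                dfs(i + 1, plan, used, partial + cost[i][c])
                used.remove(c)
                plan.pop()

    dfs(0, [], set(), 0.0)
    return best_plan, best_cost, nodes


cost = [[4, 2, 8], [4, 3, 7], [3, 1, 6]]
cold = branch_and_bound(cost)                            # no incumbent to start from
warm = branch_and_bound(cost, seed=greedy_policy(cost))  # incumbent from the policy
print(cold[1], warm[1], cold[2], warm[2])
```

Because both runs explore children in the same order and the seeded run always holds an incumbent at least as good as the unseeded one, the seeded search never visits more nodes, and both return the same optimal cost. How much is saved depends on how close the policy's plan is to optimal.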
