DC · AI · LG · Oct 22, 2021

GCNScheduler: Scheduling Distributed Computing Applications using Graph Convolutional Networks

arXiv:2110.11552v1 · 16 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for faster scheduling in dynamic distributed systems. Its method is an incremental advance, but it delivers large reductions in scheduling time.

The paper tackles the scheduling of task graphs on distributed computing systems by proposing GCNScheduler, a scheduler based on graph convolutional networks. It achieves better makespan than HEFT and nearly the same throughput as TP-HEFT while running orders of magnitude faster: for example, it schedules 50-node task graphs in about 4 ms, whereas HEFT takes more than 1500 seconds.

We consider the classical problem of scheduling task graphs corresponding to complex applications on distributed computing systems. A number of heuristics have been previously proposed to optimize task scheduling with respect to metrics such as makespan and throughput. However, they tend to be slow to run, particularly for larger problem instances, limiting their applicability in more dynamic systems. Motivated by the goal of solving these problems more rapidly, we propose, for the first time, a graph convolutional network-based scheduler (GCNScheduler). By carefully integrating an inter-task data dependency structure with network settings into an input graph and feeding it to an appropriate GCN, the GCNScheduler can efficiently schedule tasks of complex applications for a given objective. We evaluate our scheme against baselines through simulations. We show that not only can our scheme quickly and efficiently learn from existing scheduling schemes, but it can also easily be applied to large-scale settings that current scheduling schemes fail to handle. We show that it achieves better makespan than the classic HEFT algorithm, and almost the same throughput as throughput-oriented HEFT (TP-HEFT), while providing several orders of magnitude faster scheduling times in both cases. For example, for makespan minimization, GCNScheduler schedules 50-node task graphs in about 4 milliseconds while HEFT takes more than 1500 seconds; and for throughput maximization, GCNScheduler schedules 100-node task graphs in about 3.3 milliseconds, compared to about 6.9 seconds for TP-HEFT.
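
The abstract does not give the network architecture, so what follows is only a rough sketch of the general idea: treat scheduling as node classification, where a GCN reads a task graph and outputs one machine label per task. It assumes the widely used two-layer GCN propagation rule with a symmetrically normalized adjacency matrix. The feature layout, layer sizes, and random weights are all hypothetical and are not taken from the paper. In the paper's setting the weights would be trained on labels produced by an existing scheduler such as HEFT, whereas here they are untrained, so the printed assignment is meaningless beyond showing the data flow.

```python
import numpy as np


def normalized_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # degree >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(prop: np.ndarray, features: np.ndarray,
                w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """Two-layer GCN: one row of per-machine scores (logits) per task."""
    hidden = np.maximum(prop @ features @ w1, 0.0)  # ReLU
    return prop @ hidden @ w2


# Hypothetical 4-task diamond DAG: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
num_tasks, num_machines, hidden_dim = 4, 3, 8

adj = np.zeros((num_tasks, num_tasks))
for src, dst in edges:
    # Assumption: dependencies are symmetrized, as in the basic GCN.
    adj[src, dst] = adj[dst, src] = 1.0

# Hypothetical features: each row holds a task's execution time on each
# machine, standing in for the "network settings" the abstract mentions.
features = np.array([
    [2.0, 3.0, 1.5],
    [4.0, 2.5, 3.0],
    [1.0, 1.2, 0.8],
    [3.5, 3.0, 4.0],
])

rng = np.random.default_rng(0)
w1 = rng.normal(size=(features.shape[1], hidden_dim))  # untrained weights
w2 = rng.normal(size=(hidden_dim, num_machines))       # untrained weights

propagation = normalized_adjacency(adj)
logits = gcn_forward(propagation, features, w1, w2)
machine_of_task = logits.argmax(axis=1)  # one machine index per task
print(machine_of_task)
```

Once trained, inference of this kind is a handful of matrix products, which fits the millisecond-scale scheduling times the abstract reports, while list-scheduling heuristics such as HEFT must rank tasks and evaluate candidate placements one by one.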

