Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning
This work addresses a crucial bottleneck in distributed streaming systems for real-time data processing and offers a novel solution that improves throughput, though it is incremental in that it applies deep learning to an existing domain.
The paper tackles the NP-complete problem of resource allocation in stream processing by learning a generalizable strategy with a graph-aware encoder-decoder framework trained via deep reinforcement learning. The model outperforms both METIS, a state-of-the-art graph partitioning algorithm, and an LSTM-based encoder-decoder model in about 70% of test cases.
This paper considers the problem of resource allocation in stream processing, where continuous data flows must be processed in real time in a large distributed system. To maximize system throughput, the resource allocation strategy that partitions the computation tasks of a stream processing graph onto computing devices must simultaneously balance workload distribution and minimize communication. Since this graph partitioning problem is known to be NP-complete yet crucial to practical streaming systems, many heuristic-based algorithms have been developed to find reasonably good solutions. In this paper, we present a graph-aware encoder-decoder framework to learn a generalizable resource allocation strategy that can properly distribute the computation tasks of stream processing graphs not seen during training. We propose, for the first time, to leverage graph embedding to learn the structural information of stream processing graphs. Jointly trained with the graph-aware decoder using deep reinforcement learning, our approach can effectively find optimized solutions for unseen graphs. Our experiments show that the proposed model outperforms both METIS, a state-of-the-art graph partitioning algorithm, and an LSTM-based encoder-decoder model in about 70% of the test cases.
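The two competing goals named in the abstract, balanced workload and low communication, can be made concrete with a small scoring sketch. This is not the paper's reward function or model, only an illustrative proxy: the function `evaluate_allocation`, the operator names, the per-operator costs, and the edge traffic values are all hypothetical.

```python
def evaluate_allocation(node_cost, edge_traffic, assignment, num_devices):
    """Score one allocation of operators to devices (illustrative proxy only).

    node_cost:    {operator: compute cost}
    edge_traffic: {(src, dst): data volume sent along that edge}
    assignment:   {operator: device index in [0, num_devices)}
    Returns (per-device load, load imbalance, cross-device traffic).
    """
    # Sum the compute cost placed on each device.
    load = [0.0] * num_devices
    for node, cost in node_cost.items():
        load[assignment[node]] += cost

    # Imbalance is max load over mean load; 1.0 means perfectly balanced.
    mean_load = sum(load) / num_devices
    imbalance = max(load) / mean_load if mean_load > 0 else 0.0

    # Only edges whose endpoints sit on different devices cost communication.
    cut = sum(t for (u, v), t in edge_traffic.items()
              if assignment[u] != assignment[v])
    return load, imbalance, cut


# Hypothetical 4-operator pipeline: source -> parse -> join -> sink
node_cost = {"source": 1.0, "parse": 3.0, "join": 3.0, "sink": 1.0}
edge_traffic = {("source", "parse"): 10.0,
                ("parse", "join"): 2.0,
                ("join", "sink"): 5.0}

contiguous = {"source": 0, "parse": 0, "join": 1, "sink": 1}
scattered = {"source": 0, "parse": 1, "join": 0, "sink": 1}
colocated = {"source": 0, "parse": 0, "join": 0, "sink": 0}

for name, alloc in [("contiguous", contiguous),
                    ("scattered", scattered),
                    ("colocated", colocated)]:
    print(name, evaluate_allocation(node_cost, edge_traffic, alloc, 2))
```

In this toy case the contiguous and scattered allocations are equally balanced but differ widely in cross-device traffic, while placing everything on one device removes communication at the price of leaving the second device idle. A learned allocator has to trade these two quantities against each other, which is what makes the problem hard for a single fixed heuristic.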