SGDP: A Stream-Graph Neural Network Based Data Prefetcher
This work addresses complex non-sequential access patterns in real-world storage systems, offering a data prefetching approach that improves both performance and robustness.
The paper tackles data prefetching in storage systems by proposing SGDP, a stream-graph neural network-based prefetcher that models logical block address (LBA) delta streams as a weighted directed graph to capture spatial interdependencies. Compared to state-of-the-art methods, SGDP achieves a 6.21% higher hit ratio, a 7.00% higher effective prefetching ratio, and 3.13X faster inference on average.
Data prefetching is important for optimizing storage systems and improving access performance. Traditional prefetchers are effective at mining sequential logical block address (LBA) access patterns but cannot handle the complex non-sequential patterns that are common in real-world applications. State-of-the-art (SOTA) learning-based prefetchers cover more LBA accesses. However, they do not adequately consider the spatial interdependencies between LBA deltas, which limits their performance and robustness. This paper proposes a novel Stream-Graph neural network-based Data Prefetcher (SGDP). Specifically, SGDP models LBA delta streams using a weighted directed graph structure that represents the interactive relations among LBA deltas, and then extracts hybrid features with graph neural networks for data prefetching. We conduct extensive experiments on eight real-world datasets. Empirical results verify that SGDP outperforms the SOTA methods by 6.21% in hit ratio and by 7.00% in effective prefetching ratio, and speeds up inference by 3.13X on average. In addition, we generalize SGDP to different variants through different stream constructions, further expanding its application scenarios and demonstrating its robustness. SGDP offers a novel data prefetching solution and has been verified in commercial hybrid storage systems in the experimental phase. Our code and appendix are available at https://github.com/yyysjz1997/SGDP/.
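The idea of representing an LBA delta stream as a weighted directed graph can be illustrated with a small example. This is a minimal sketch, not SGDP's actual implementation: it assumes that nodes are the distinct LBA deltas in a stream and that an edge weight counts how often one delta immediately follows another. The function names and the toy trace are hypothetical, and the real stream construction and edge weighting may differ.

```python
from collections import defaultdict


def lba_deltas(lbas):
    """Return the differences between consecutive LBAs in an access trace."""
    return [b - a for a, b in zip(lbas, lbas[1:])]


def build_delta_graph(deltas):
    """Build a weighted directed graph over LBA deltas.

    Assumption for illustration: each distinct delta is a node, and the
    weight of edge u -> v is the number of times delta v directly follows
    delta u in the stream.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for u, v in zip(deltas, deltas[1:]):
        graph[u][v] += 1
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {u: dict(successors) for u, successors in graph.items()}


def normalize_out_edges(graph):
    """Scale each node's outgoing edge weights so that they sum to 1."""
    normalized = {}
    for u, successors in graph.items():
        total = sum(successors.values())
        normalized[u] = {v: w / total for v, w in successors.items()}
    return normalized


# Toy trace (hypothetical): a strided pattern with a periodic short jump.
trace = [100, 108, 116, 120, 128, 136, 140]
deltas = lba_deltas(trace)
graph = build_delta_graph(deltas)
normalized = normalize_out_edges(graph)
```

In this toy trace, the delta 8 is followed equally often by 8 and by 4, which a purely sequential prefetcher would not capture. A graph neural network could then take such a graph (node features plus edge weights) as input when predicting the next delta.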