Robust Scheduling with GFlowNets
This work addresses a critical bottleneck in compiler optimization for developers and researchers, offering a robust method to improve scheduling performance, though it is incremental as it builds on existing GFlowNet techniques.
The paper tackles the NP-hard problem of scheduling operations in computation graphs for compiler optimization, where evaluating schedules on target hardware is time-consuming, and proposes a GFlowNet method that samples schedules proportionally to a proxy metric, demonstrating empirically that it outperforms pure optimization baselines on target hardware and generalizes to unseen scheduling problems.
Finding the best way to schedule operations in a computation graph is a classical NP-hard problem which is central to compiler optimization. However, evaluating the goodness of a schedule on the target hardware can be very time-consuming. Traditional approaches as well as previous machine learning ones typically optimize proxy metrics, which are fast to evaluate but can lead to bad schedules when tested on the target hardware. In this work, we propose a new approach to scheduling by sampling proportionally to the proxy metric using a novel GFlowNet method. We introduce a technique to control the trade-off between diversity and goodness of the proposed schedules at inference time and demonstrate empirically that the pure optimization baselines can lead to subpar performance with respect to our approach when tested on a target model. Furthermore, we show that conditioning the GFlowNet on the computation graph enables generalization to unseen scheduling problems for both synthetic and real-world compiler datasets.