Reinforcement Learning and Adaptive Sampling for Optimized DNN Compilation
This work addresses the need for faster, more efficient DNN compilation, which matters for innovation in neural network deployment. The contribution is incremental in that it builds on existing methods such as AutoTVM.
The paper tackles the problem of optimizing DNN compilation by formulating it as a reinforcement learning task with adaptive sampling, resulting in a 4.45x speedup in optimization time over AutoTVM and a 5.6% improvement in inference time.
Achieving faster execution with shorter compilation time can enable further diversity and innovation in neural networks. However, the current paradigm of executing neural networks relies on hand-optimized libraries, traditional compilation heuristics, or, very recently, simulated annealing and genetic algorithms. Our work takes a different approach by formulating compiler optimizations for neural networks as a reinforcement learning problem, whose solution takes fewer steps to converge. This solution, dubbed ReLeASE, comes with a sampling algorithm that leverages clustering to focus the costly samples (real hardware measurements) on representative points, each of which stands in for an entire subspace. Our adaptive sampling not only reduces the number of samples but also improves their quality, giving better exploration in a shorter time. Experiments on real hardware show that reinforcement learning with adaptive sampling provides a 4.45x speedup in optimization time over AutoTVM, while also improving the inference time of modern deep networks by 5.6%. Further experiments confirm that our adaptive sampling can even speed up AutoTVM's simulated annealing by 4.00x.
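The clustering-based sampling idea can be illustrated with a minimal sketch. The abstract does not say which clustering algorithm is used, so plain k-means is assumed here. The function names (`kmeans`, `select_representatives`) and the encoding of each candidate configuration as a tuple of numbers are hypothetical, chosen only for illustration. The sketch groups candidate configurations and keeps the candidate nearest each centroid, so that only those representatives would be sent for costly hardware measurement.

```python
import random
from typing import List, Sequence, Tuple

# A candidate configuration, encoded as a tuple of numeric knobs (an assumption).
Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two configurations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], k: int, iterations: int = 20, seed: int = 0) -> List[Point]:
    """Plain k-means (Lloyd's algorithm); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(list(points), k)  # initialize from the data
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids: List[Point] = []
        for i, members in enumerate(clusters):
            if members:
                dim = len(members[0])
                mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids


def select_representatives(candidates: Sequence[Point], k: int, seed: int = 0) -> List[Point]:
    """Pick at most k candidates, one per cluster, to measure on hardware."""
    if k >= len(candidates):
        return list(dict.fromkeys(candidates))  # nothing to prune; drop duplicates
    centroids = kmeans(candidates, k, seed=seed)
    representatives: List[Point] = []
    for c in centroids:
        # Centroids are usually not valid configurations, so snap each one
        # to the closest real candidate.
        rep = min(candidates, key=lambda p: squared_distance(p, c))
        if rep not in representatives:
            representatives.append(rep)
    return representatives


if __name__ == "__main__":
    # Hypothetical two-knob configurations proposed by a search agent.
    proposed = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (10.0, 0.0), (9.8, 0.3)]
    for config in select_representatives(proposed, k=3):
        print("measure on hardware:", config)
```

Snapping each centroid to its nearest candidate ensures that every measured point is a configuration the search actually proposed. The number of clusters `k` then acts as the per-round measurement budget.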