Moses: Efficient Exploitation of Cross-device Transferable Features for Tensor Program Optimization
Moses addresses the labor-intensive need to train a domain-specific predictor for each new hardware platform in DNN compilers, offering a more efficient way to optimize tensor programs across devices.
The paper tackles the problem of efficiently generating tensor programs in DNN compilers. It proposes Moses, a design based on the lottery ticket hypothesis that uses domain adaptation to exploit features transferable across devices, and reports up to a 1.53x efficiency gain in the search stage and a 1.41x inference speedup on DNN benchmarks.
Achieving efficient execution of machine learning models has attracted significant attention recently. To generate tensor programs efficiently, DNN compilers rely on a key component, the cost model, which predicts the performance of each configuration on a specific device. However, due to the rapid emergence of hardware platforms, it is increasingly labor-intensive to train domain-specific predictors for every new platform. Moreover, current cost model designs cannot provide transferable features between different hardware accelerators efficiently and effectively. In this paper, we propose Moses, a simple and efficient design based on the lottery ticket hypothesis, which takes full advantage of the features transferable to the target device via domain adaptation. Compared with state-of-the-art approaches, Moses achieves up to a 1.53x efficiency gain in the search stage and a 1.41x inference speedup on challenging DNN benchmarks.
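To make the idea of reusing a source-device cost model on a new device more concrete, the following is a minimal sketch of masked fine-tuning with a lottery-ticket-style importance mask. It is an illustration under stated assumptions, not the algorithm from the paper: the linear cost model, the |weight x gradient| importance score, the `keep_ratio` parameter, the synthetic data, and all function names are hypothetical choices made for this example.

```python
import random

def predict(w, x):
    # Linear cost model (assumed for illustration): predicted cost = w . x
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    # Mean squared error of the cost model on (features, measured cost) pairs.
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def mse_grad(w, data):
    # Gradient of the mean squared error with respect to each weight.
    g = [0.0] * len(w)
    for x, y in data:
        err = predict(w, x) - y
        for i, xi in enumerate(x):
            g[i] += 2.0 * err * xi / len(data)
    return g

def train(w, data, lr=0.1, steps=300, mask=None):
    # Plain gradient descent; if a mask is given, only masked weights are updated.
    w = list(w)
    for _ in range(steps):
        g = mse_grad(w, data)
        for i in range(len(w)):
            if mask is None or mask[i]:
                w[i] -= lr * g[i]
    return w

def select_mask(w, data, keep_ratio=0.5):
    # Hypothetical criterion: adapt the weights with the largest |w * grad|
    # on target-device data and freeze the rest at their source-trained values.
    g = mse_grad(w, data)
    scores = [abs(wi * gi) for wi, gi in zip(w, g)]
    k = max(1, int(len(w) * keep_ratio))
    ranked = sorted(range(len(w)), key=lambda i: scores[i], reverse=True)
    chosen = set(ranked[:k])
    return [i in chosen for i in range(len(w))]

def make_data(true_w, n, rng):
    # Synthetic "measurements": random features with noiseless linear cost.
    data = []
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0) for _ in true_w]
        data.append((x, predict(true_w, x)))
    return data

rng = random.Random(0)
source_w = [1.0, -2.0, 0.5, 3.0]   # made-up ground truth for the source device
target_w = [1.0, -2.0, 1.5, 1.0]   # made-up target device: shares two weights
source_data = make_data(source_w, 200, rng)  # plentiful source measurements
target_data = make_data(target_w, 20, rng)   # scarce target measurements

pretrained = train([0.0] * 4, source_data)
mask = select_mask(pretrained, target_data, keep_ratio=0.5)
adapted = train(pretrained, target_data, mask=mask)

print("target MSE before adaptation:", mse(pretrained, target_data))
print("target MSE after adaptation: ", mse(adapted, target_data))
```

The design choice illustrated here is that only a subset of parameters is updated with the small target-device dataset, while the remaining parameters keep what was learned on the source device. How Moses actually separates transferable from non-transferable parameters is not specified in the abstract above.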