Bring Your Own Codegen to Deep Learning Compiler
This work addresses a real problem for accelerator vendors by reducing compiler development overhead, though its contribution is incremental because it builds on existing deep learning compilers.
The paper tackles the cost of developing and maintaining a full compiler stack for each deep learning accelerator. It proposes an open-source framework that lets vendors reuse existing compiler components and focus on their proprietary code generation; the framework has been deployed in multiple commercial vendors' compiler stacks with only a few thousand lines of code.
Deep neural networks (DNNs) have been ubiquitously applied in many applications, and accelerators have emerged as an enabler to support fast and efficient inference for these applications. However, to achieve high model coverage with high performance, each accelerator vendor has to develop a full compiler stack to ingest, optimize, and execute the DNNs. This poses significant challenges in the development and maintenance of the software stack. In addition, the vendors have to continuously update their hardware and/or software to cope with the rapid evolution of DNN model architectures and operators. To address these issues, this paper proposes an open-source framework that enables users to concentrate only on the development of their proprietary code generation tools by reusing as many components as possible from existing deep learning compilers. Our framework provides users with flexible and easy-to-use interfaces to partition their models into segments that can be executed on "the best" processors, taking advantage of the powerful computation capability of accelerators. Our case study shows that our framework has been deployed in multiple commercial vendors' compiler stacks with only a few thousand lines of code.
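The partitioning idea mentioned above can be illustrated with a minimal sketch. It is not the framework's actual interface: the names `Segment`, `partition`, and the `"accel"`/`"host"` labels are hypothetical, and the model is simplified to a linear sequence of operator names rather than a dataflow graph. Under those assumptions, consecutive operators that the accelerator supports are grouped into one segment, and everything else falls back to the host.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Segment:
    """A run of consecutive operators assigned to one processor (hypothetical type)."""
    target: str      # "accel" or "host" (illustrative labels)
    ops: List[str]


def partition(ops: List[str], accel_supported: Set[str]) -> List[Segment]:
    """Greedily group consecutive ops by whether the accelerator supports them."""
    segments: List[Segment] = []
    for op in ops:
        target = "accel" if op in accel_supported else "host"
        if segments and segments[-1].target == target:
            # Same processor as the previous op: extend the current segment.
            segments[-1].ops.append(op)
        else:
            # Processor changes: start a new segment.
            segments.append(Segment(target, [op]))
    return segments


if __name__ == "__main__":
    model = ["conv2d", "relu", "softmax", "dense"]
    for seg in partition(model, accel_supported={"conv2d", "relu", "dense"}):
        print(seg.target, seg.ops)
```

In this toy setting a vendor would only need to supply the set of supported operators and a code generator for each accelerator segment; a real compiler would operate on a graph and would also weigh data-transfer costs between segments.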