Comparing the costs of abstraction for DL frameworks
This study examines the practical performance overheads that engineers and researchers incur when using DL frameworks, and identifies where the highest costs arise.
This paper investigates the performance costs associated with high-level abstractions in Deep Learning (DL) frameworks. It compares PyTorch, LibTorch, TorchScript, and cuDNN by training and evaluating a representative DL model, analyzing accuracy, execution time, and memory efficiency.
High-level abstractions for implementing, training, and testing Deep Learning (DL) models abound. Such frameworks function primarily by abstracting away the implementation details of arbitrary neural architectures, thereby enabling researchers and engineers to focus on design. In principle, such frameworks could be "zero-cost abstractions"; in practice, they incur translation and indirection overheads. We study where in the engineering life cycle of a DL model the highest costs are paid and whether they can be mitigated. We train, test, and evaluate a representative DL model using PyTorch, LibTorch, TorchScript, and cuDNN on representative datasets, comparing accuracy, execution time, and memory efficiency.
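To make the notion of indirection overhead concrete, the following is a minimal, standard-library-only sketch of how one might time the same computation through a direct path and through a wrapper layer. It is a toy illustration, not the harness used in this study: `dot_direct`, `Layer`, `median_time`, and `compare` are hypothetical names introduced here, and the wrapper only mimics the kind of argument checking and extra dispatch that a high-level framework may add.

```python
import statistics
import time


def dot_direct(a, b):
    """Low-level path: a plain loop with no extra layers."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


class Layer:
    """Hypothetical high-level wrapper (an assumption for illustration)."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args):
        # Indirection: argument validation plus an extra call frame.
        for arg in args:
            if not isinstance(arg, list):
                raise TypeError("expected list inputs")
        return self.fn(*args)


def median_time(fn, args, repeats=20, warmup=3):
    """Median wall-clock seconds per call, measured after warm-up runs."""
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(n=1000):
    """Time the direct and wrapped paths on identical inputs."""
    a = [float(i) for i in range(n)]
    b = [1.0] * n
    wrapped = Layer(dot_direct)
    # Both paths must agree on the result before timings are compared.
    assert wrapped(a, b) == dot_direct(a, b)
    return {
        "direct_s": median_time(dot_direct, (a, b)),
        "wrapped_s": median_time(wrapped, (a, b)),
    }


if __name__ == "__main__":
    print(compare())
```

Reporting the median over repeated runs after a warm-up phase is one common way to reduce noise in such measurements; the absolute numbers depend on the machine and say nothing about any particular DL framework.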