Accelerated Digital Twin Learning for Edge AI: A Comparison of FPGA and Mobile GPU
This work addresses the need for efficient digital twin learning in mission-critical healthcare applications such as diabetes and heart disease detection, though its contribution is incremental, since it compares existing hardware implementations.
The paper tackles the problem of fast, resource-efficient digital twin learning for edge AI in healthcare by presenting a framework amenable to FPGA acceleration, achieving an 8.8x improvement in performance-per-watt, a 28.5x reduction in DRAM footprint, and a 1.67x runtime speedup compared to cloud GPU baselines.
Digital twins (DTs) can enable precision healthcare by continually learning a mathematical representation of patient-specific dynamics. However, mission-critical healthcare applications require fast, resource-efficient DT learning, which is often infeasible with existing model recovery (MR) techniques because they rely on iterative solvers and have high compute and memory demands. In this paper, we present a general DT learning framework that is amenable to acceleration on reconfigurable hardware such as FPGAs, enabling substantial speedup and energy efficiency. We compare our FPGA-based implementation with a multi-processing implementation on a mobile GPU, which is a popular choice for AI on edge devices. Further, we compare both edge AI implementations with a cloud GPU baseline. Specifically, our FPGA implementation achieves an 8.8x improvement in performance-per-watt for the MR task, a 28.5x reduction in DRAM footprint, and a 1.67x runtime speedup compared to cloud GPU baselines. The mobile GPU, on the other hand, achieves 2x better performance-per-watt than the FPGA, but at the cost of a 2x longer runtime and a 10x larger DRAM footprint. We demonstrate the technique in DT-guided synthetic data generation for Type 1 Diabetes and in proactive coronary artery disease detection.
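To make the three reported ratios concrete, the following minimal sketch shows one way such comparisons could be computed from raw benchmark measurements. It assumes that "performance" means MR tasks completed per second, so that performance-per-watt is the inverse of energy per task; the abstract does not state the exact definition used. The names `Measurement`, `perf_per_watt`, and `compare` are hypothetical, and the numbers are placeholders chosen for easy arithmetic, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    """One benchmark run on a platform (hypothetical placeholder fields)."""
    runtime_s: float    # wall-clock time for one MR task, in seconds
    avg_power_w: float  # average power draw during the run, in watts
    dram_mb: float      # peak DRAM footprint, in megabytes


def perf_per_watt(m: Measurement) -> float:
    # Assumed definition: performance = tasks per second = 1 / runtime.
    # Dividing by power gives tasks per joule.
    return 1.0 / (m.runtime_s * m.avg_power_w)


def compare(candidate: Measurement, baseline: Measurement) -> dict:
    # Each ratio is oriented so that a value above 1 favours the candidate.
    return {
        "perf_per_watt_gain": perf_per_watt(candidate) / perf_per_watt(baseline),
        "dram_reduction": baseline.dram_mb / candidate.dram_mb,
        "runtime_speedup": baseline.runtime_s / candidate.runtime_s,
    }


# Placeholder numbers only; they do not come from the paper.
baseline = Measurement(runtime_s=10.0, avg_power_w=100.0, dram_mb=4000.0)
candidate = Measurement(runtime_s=5.0, avg_power_w=20.0, dram_mb=400.0)

ratios = compare(candidate, baseline)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}x")
```

Under this definition, a platform can have a better performance-per-watt figure while still being slower, as long as its power draw is proportionally lower. This is consistent with the trade-off reported between the mobile GPU and the FPGA.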