TF2AIF: Facilitating development and deployment of accelerated AI models on the cloud-edge continuum
This work addresses the problem of specialized coding requirements for AI deployment in B5G/6G systems, facilitating research in resource management and automated operations, though it appears incremental as it builds upon existing tool-flows.
The paper tackles the challenge of efficiently deploying AI models across heterogeneous hardware in cloud-edge environments by introducing TF2AIF, a tool that automatically generates multiple software versions from high-level inputs like TensorFlow, enabling deployment on diverse platforms with minimal user effort.
The B5G/6G evolution relies on connect-compute technologies and highly heterogeneous clusters with HW accelerators, which require specialized coding to be efficiently utilized. The current paper proposes a custom tool for generating multiple SW versions of a certain AI function input in high-level language, e.g., Python TensorFlow, while targeting multiple diverse HW+SW platforms. TF2AIF builds upon disparate tool-flows to create a plethora of relative containers and enable the system orchestrator to deploy the requested function on any peculiar node in the cloud-edge continuum, i.e., to leverage the performance/energy benefits of the underlying HW upon any circumstances. TF2AIF fills an identified gap in today's ecosystem and facilitates research on resource management or automated operations, by demanding minimal time or expertise from users.