DC AI SY

Jun 2, 2023

An Overview on Generative AI at Scale with Edge-Cloud Computing

Yun-Cheng Wang, Jintang Xue, Chengwei Wei, C. -C. Jay Kuo

arXiv:2306.17170v28.672 citations

h-index: 90

Originality: Synthesis-oriented

AI Analysis

This is an incremental overview that synthesizes existing research on GenAI and edge-cloud computing to identify technical challenges and future directions for scalable deployment.

This overview paper addresses the challenge of scaling generative AI (GenAI) systems, which face high latency and heavy resource demands under traditional cloud computing frameworks. It proposes edge-cloud computing as a way to reduce latency while handling large-scale data and a high volume of requests.

As a specific category of artificial intelligence (AI), generative artificial intelligence (GenAI) generates new content that resembles what is created by humans. The rapid development of GenAI systems has created a huge amount of new data on the Internet, posing new challenges to current computing and communication frameworks. Currently, GenAI services rely on the traditional cloud computing framework because they need large computational resources. However, such services encounter high latency because of data transmission and a high volume of requests. In contrast, edge-cloud computing can provide adequate computation power and low latency at the same time through collaboration between edges and the cloud. Thus, it is attractive to build GenAI systems at scale by leveraging the edge-cloud computing paradigm. In this overview paper, we review recent developments in both GenAI and edge-cloud computing. Then, we use two exemplary GenAI applications to discuss technical challenges in scaling up their solutions using edge-cloud collaborative systems. Finally, we list design considerations for training and deploying GenAI systems at scale and point out future research directions.
