DC · AI · SY · Jun 2, 2023

An Overview on Generative AI at Scale with Edge-Cloud Computing

arXiv:2306.17170v2 · 72 citations · h-index: 90
Originality: Synthesis-oriented
AI Analysis

This is an incremental overview that synthesizes existing research on GenAI and edge-cloud computing to identify technical challenges and future directions for scalable deployment.

This overview paper addresses the challenge of scaling generative AI (GenAI) systems. Because GenAI services in traditional cloud frameworks face high latency and heavy resource demands, the paper proposes edge-cloud computing as a way to reduce latency and handle large-scale data and request volumes.

As a specific category of artificial intelligence (AI), generative artificial intelligence (GenAI) generates new content that resembles what is created by humans. The rapid development of GenAI systems has created a huge amount of new data on the Internet, posing new challenges to current computing and communication frameworks. Currently, GenAI services rely on the traditional cloud computing framework due to the need for large computation resources. However, such services will encounter high latency because of data transmission and a high volume of requests. On the other hand, edge-cloud computing can provide adequate computation power and low latency at the same time through the collaboration between edges and the cloud. Thus, it is attractive to build GenAI systems at scale by leveraging the edge-cloud computing paradigm. In this overview paper, we review recent developments in GenAI and edge-cloud computing, respectively. Then, we use two exemplary GenAI applications to discuss technical challenges in scaling up their solutions using edge-cloud collaborative systems. Finally, we list design considerations for training and deploying GenAI systems at scale and point out future research directions.

