Software Performance Engineering for Foundation Model-Powered Software (FMware)
This paper addresses performance issues in FMware for software engineers, but its contribution is incremental: it surveys existing challenges without proposing new solutions.
The paper tackles performance engineering for foundation model-powered software (FMware). It argues that performance engineering is often overlooked, which leads to costly post-deployment optimization, and it identifies four key challenges drawn from the literature and from in-house development experience.
The rise of Foundation Models (FMs) such as Large Language Models (LLMs) is revolutionizing software development. Despite impressive prototypes, turning FM-powered software (FMware) into production-ready products demands complex engineering across various domains. A critical but overlooked aspect is performance engineering, which aims to ensure that FMware meets performance goals such as throughput and latency, avoiding user dissatisfaction and financial loss. Performance considerations are often an afterthought, leading to costly optimization efforts after deployment. FMware's high demand for computational resources underscores the need for efficient hardware use. Continuous performance engineering is essential to prevent performance degradation. This paper highlights the significance of Software Performance Engineering (SPE) in FMware and identifies four key challenges: cognitive architecture design, communication protocols, tuning and optimization, and deployment. These challenges are drawn from literature surveys and from experience developing an in-house FMware system. We discuss problems, current practices, and innovative paths for the software engineering community.