Characterizing FaaS Workflows on Public Clouds: The Good, the Bad and the Ugly
For developers and researchers using FaaS workflows, this study provides a principled analysis of performance and cost behavior, though its contribution is incremental: it extends existing FaaS characterization work to workflow platforms.
This paper characterizes three popular FaaS workflow platforms from AWS and Azure (including AWS Step Functions and Azure Durable Functions) through extensive evaluations of 25 workflows over 132k invocations, revealing insights into execution, orchestration, scaling, and costs that help developers configure and program these platforms.
Function-as-a-service (FaaS) is a popular serverless computing paradigm for developing event-driven functions that scale elastically on public clouds. FaaS workflows, such as AWS Step Functions and Azure Durable Functions, are composed from FaaS functions, like AWS Lambda and Azure Functions, to build practical applications. However, the complex interactions between functions in a workflow and the limited visibility into the internals of proprietary FaaS platforms are major impediments to gaining a deeper understanding of FaaS workflow platforms. While several works characterize FaaS platforms to derive such insights, a principled and rigorous study of FaaS workflow platforms is lacking; these platforms have unique scaling, performance and cost behavior influenced by the platform design, dataflow and workloads. In this article, we perform extensive evaluations of three popular FaaS workflow platforms from AWS and Azure, running 25 micro-benchmark and application workflows over 132k invocations. Our detailed analysis confirms some conventional wisdom but also uncovers unique insights into function execution, workflow orchestration, inter-function interactions, cold-start scaling and monetary costs. Our observations help developers better configure and program these platforms, set performance and scalability expectations, and identify research gaps in enhancing the platforms.