DC · May 22

Flare: Leveraging Serverless Elasticity to Absorb Microservice Load Spikes

arXiv:2605.23707 · 4.2
AI Analysis

For online service providers, Flare offers a cost-effective solution to absorb load spikes without over-provisioning, addressing the scalability bottleneck of VM-based microservice deployments.

Flare proposes a hybrid microservice architecture combining VMs and serverless computing to handle unpredictable load spikes, reducing cost by 60% compared to over-provisioning while maintaining responsiveness.

Online services strive to maintain application responsiveness even when the traffic is unpredictable and fluctuating. Today's online services are commonly deployed as chains of microservices, each microservice packaged as one or more containers inside virtual machines (VMs). While performant and affordable when the load is steady, VM-based deployments are known to be slow to scale when the load spikes, resulting in degraded performance for end-users of the service. To avoid such performance degradations, service providers can over-provision their deployments; however, such a strategy is costly and inefficient, leaving resources under-utilized for extended periods. To address the challenge of unpredictable load spikes, we propose Flare, a hybrid microservice architecture that combines VMs with serverless computing. Flare utilizes VMs to cost-effectively handle steady workloads and leverages serverless elasticity to absorb traffic spikes. When a spike occurs, Flare detects which specific service(s) are overloaded and shifts the excess load of only those services to serverless, thus minimizing the cost overhead. Flare seamlessly integrates into existing auto-scaling and serverless infrastructure, requiring minimal changes to the control plane and no modifications to the application.
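The selective offloading idea in the abstract can be illustrated with a minimal sketch. It is not Flare's implementation: the `ServiceState` type, the `split_load` and `plan_offload` functions, the service names, and the capacity figures are all hypothetical, and the sketch assumes that each service's VM capacity is known as a fixed requests-per-second value. It only shows the routing rule described above: VMs serve traffic up to their capacity, and only the excess of an overloaded service goes to serverless.

```python
from dataclasses import dataclass


@dataclass
class ServiceState:
    """Hypothetical per-service view: arriving load and what the VMs can sustain."""
    name: str
    offered_rps: float      # requests/second currently arriving
    vm_capacity_rps: float  # requests/second the provisioned VMs can serve (assumed known)


def split_load(service: ServiceState) -> tuple[float, float]:
    """Return (vm_rps, serverless_rps) for one service.

    VMs take traffic up to their capacity; only the excess overflows.
    """
    vm_rps = min(service.offered_rps, service.vm_capacity_rps)
    serverless_rps = max(0.0, service.offered_rps - service.vm_capacity_rps)
    return vm_rps, serverless_rps


def plan_offload(chain: list[ServiceState]) -> dict[str, float]:
    """Map each overloaded service in the chain to the load shifted to serverless."""
    plan = {}
    for svc in chain:
        _, overflow = split_load(svc)
        if overflow > 0:
            plan[svc.name] = overflow
    return plan


# Illustrative numbers only: a spike hits "search" while the other services stay within capacity.
chain = [
    ServiceState("frontend", offered_rps=900.0, vm_capacity_rps=1000.0),
    ServiceState("search", offered_rps=1500.0, vm_capacity_rps=1000.0),
    ServiceState("checkout", offered_rps=200.0, vm_capacity_rps=400.0),
]
print(plan_offload(chain))  # {'search': 500.0}
```

In this toy chain only `search` exceeds its VM capacity, so only its 500 requests/second of excess would be sent to serverless; the other services stay entirely on VMs, which is what keeps the cost overhead limited to the overloaded part of the chain.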

