C. Bash

54.2 DC May 13

Sustainable Graph Analytics Workload Scheduling with Evolutionary Reinforcement Learning in Edge-Cloud Systems

P. Ramicetty, H. Moore, S. Qi et al.

Graph analytics powers modern intelligent systems such as smart cities, cyber-physical infrastructure, IoT security, and large-scale social networks. As these workloads grow in complexity, executing them in heterogeneous edge-cloud environments increases energy use and carbon emissions. To address this challenge, we propose MERSEM, a multi-objective evolutionary reinforcement learning framework for sustainable edge-cloud system management. MERSEM integrates evolutionary search with reinforcement learning (RL) to solve the graph workload allocation and scheduling problem. The evolutionary component explores diverse global solutions, while the RL agent refines decisions through adaptive local optimization. The framework jointly minimizes service-level agreement (SLA) violations and carbon emissions by accounting for dynamic carbon intensity, resource heterogeneity, and workload characteristics. Experimental results show that MERSEM outperforms the state of the art, reducing SLA violations by up to 45% and carbon emissions by up to 12%.

22.7 DC May 13

MARLIN: Multi-Agent Game-Theoretic Reinforcement Learning for Sustainable LLM Inference in Cloud Datacenters

H. Moore, S. Qi, D. Milojicic et al.

Large Language Models (LLMs) have become increasingly prevalent in cloud-based platforms, propelled by the introduction of AI-based consumer and enterprise services. LLM inference requests in particular account for up to 90% of total LLM lifecycle energy use, dwarfing training energy costs. The rising volume of inference requests is increasing the environmental footprint of LLM serving, particularly carbon emissions and water consumption. To improve the sustainability of LLM inference serving in cloud datacenters, we propose MARLIN, a novel multi-agent game-theoretic reinforcement learning framework that co-optimizes time-to-first-token (TTFT), carbon emissions, water usage, and energy costs associated with LLM inference. MARLIN achieves reductions of at least 18% in TTFT, 33% in carbon emissions, 43% in water usage, and 11% in energy costs compared to state-of-the-art LLM inference management frameworks.

2 Papers