
Green-LLM: Optimal Workload Allocation for Environmentally-Aware Distributed Inference

Jiaming Cheng, Duong Tung Nguyen

Year of publication
2025
Access
Open access

Abstract

This paper investigates the optimal allocation of large language model (LLM) inference workloads across heterogeneous edge data centers over time. Each data center features on-site renewable generation and faces dynamic electricity prices and spatiotemporal variability in renewable availability. We propose Green-LLM, a lexicographic multi-objective optimization framework that addresses this challenge without requiring manual weight tuning. The proposed model incorporates real-world constraints, including token-dependent processing delay and energy consumption, heterogeneous hardware capabilities, dynamic renewable generation, and spatiotemporal variations in electricity prices and carbon intensity. Unlike existing approaches that optimize individual environmental metrics in isolation, Green-LLM jointly minimizes operational cost, carbon emissions, and delay penalty while enforcing water consumption constraints to ensure both sustainability and quality-of-service requirements. Numerical results demonstrate that Green-LLM achieves significant reductions in carbon emissions and water consumption while maintaining operational costs within 3% of the minimum and ensuring sub-2-second response latency. These findings show that sustainable LLM inference can be achieved without sacrificing service quality or economic efficiency.
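To make the lexicographic idea concrete, here is a minimal sketch in Python. It is not the paper's model: the two sites, their per-request coefficients, the water budget, the priority order (cost, then carbon, then delay), and the 3% tolerance used between stages are all assumptions chosen for illustration. The sketch enumerates a coarse grid of workload splits, discards those that violate capacity or the water budget, and then optimizes the objectives one at a time, keeping only allocations that stay within the tolerance of the best value found at each earlier stage.

```python
# Hypothetical per-request coefficients for two edge data centers.
# All numbers are illustrative assumptions, not values from the paper.
SITES = {
    "A": {"price": 0.10, "carbon": 0.50, "water": 1.8, "delay": 0.8, "capacity": 60},
    "B": {"price": 0.12, "carbon": 0.20, "water": 1.0, "delay": 1.2, "capacity": 60},
}
DEMAND = 80        # total requests to place (assumed)
WATER_CAP = 122.0  # water budget across both sites (assumed)
STEP = 5           # grid resolution for the brute-force search


def total(alloc, key):
    """Sum a per-request coefficient over the allocation."""
    return sum(SITES[site][key] * load for site, load in alloc.items())


def cost(alloc):
    return total(alloc, "price")


def carbon(alloc):
    return total(alloc, "carbon")


def delay(alloc):
    return total(alloc, "delay")


def is_feasible(alloc):
    """Check site capacities and the water-consumption constraint."""
    within_capacity = all(0 <= load <= SITES[s]["capacity"] for s, load in alloc.items())
    return within_capacity and total(alloc, "water") <= WATER_CAP


# Assumed priority order; the abstract does not state the ordering.
PRIORITY = (cost, carbon, delay)


def lexicographic_allocation(eps=0.03):
    """Optimize objectives in priority order, allowing each earlier
    objective to degrade by at most a fraction eps of its best value."""
    candidates = [{"A": a, "B": DEMAND - a} for a in range(0, DEMAND + 1, STEP)]
    feasible = [c for c in candidates if is_feasible(c)]
    for objective in PRIORITY[:-1]:
        best = min(objective(c) for c in feasible)
        feasible = [c for c in feasible if objective(c) <= best * (1 + eps)]
    # The last objective is minimized over whatever remains.
    return min(feasible, key=PRIORITY[-1])


if __name__ == "__main__":
    print(lexicographic_allocation(eps=0.03))  # relaxed lexicographic
    print(lexicographic_allocation(eps=0.0))   # strict lexicographic
```

With these assumed numbers, the strict variant (`eps=0.0`) sends as much load as the water budget allows to the cheaper site A, while the relaxed variant accepts a slightly higher cost to shift load toward the lower-carbon site B. No weights are tuned in either case; only the ordering and the tolerance are chosen. A practical formulation would use a mixed-integer or linear programming solver over time slots rather than grid enumeration.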

Keywords

cs.NI, cs.DC, eess.SY, math.OC
