35.4SYApr 29
Regime-Adaptive Weighted Ensemble Learning for Computing-Driven Dynamic Load Forecasting in AI Data CentersZiying Wang, Ying Zhang, Lei Wang et al.
Short-term load forecasting for AI data centers presents new challenges because it is computing-driven, with heterogeneous job arrivals, sizes, and durations exhibiting bursty, non-stationary dynamics. Compared with traditional load types, data center loads are less researched and can pose greater threats to the efficiency and stability of power grids. To close the gap, this paper proposes a regime-adaptive ensemble learning forecasting algorithm to predict computing-driven dynamic workloads in AI data centers. A weight-learned neural network within an ensemble learning framework is developed to exploit the complementary strengths of two machine learning (ML) submodels across varying operating regimes. Furthermore, a novel feature engineering strategy is developed to incrementally learn from a non-stationary data stream. Thus, the ensemble weights are dynamically optimized to facilitate adaptive calibration of inter-submodel contributions. Comparative case studies on the MIT Supercloud dataset demonstrate that the proposed method significantly enhances load forecasting accuracy and adaptivity across various regimes, and the selected combination of ML models for ensemble learning outperforms other possible combinations. To the best of our knowledge, our method is the first to reduce minute-class forecasting errors for AI data center loads to below 1%, highlighting its potential for grid-interactive coordination and demand response.
AIAug 26, 2025
RLMR: Reinforcement Learning with Mixed Rewards for Creative WritingJianxing Liao, Tian Zhang, Xiao Feng et al.
Large language models are extensively utilized in creative writing applications. Creative writing requires a balance between subjective writing quality (e.g., literariness and emotional expression) and objective constraint following (e.g., format requirements and word limits). Existing methods find it difficult to balance these two aspects: single reward strategies fail to improve both abilities simultaneously, while fixed-weight mixed-reward methods lack the ability to adapt to different writing scenarios. To address this problem, we propose Reinforcement Learning with Mixed Rewards (RLMR), utilizing a dynamically mixed reward system from a writing reward model evaluating subjective writing quality and a constraint verification model assessing objective constraint following. The constraint following reward weight is adjusted dynamically according to the writing quality within sampled groups, ensuring that samples violating constraints get negative advantage in GRPO and thus penalized during training, which is the key innovation of this proposed method. We conduct automated and manual evaluations across diverse model families from 8B to 72B parameters. Additionally, we construct a real-world writing benchmark named WriteEval for comprehensive evaluation. Results illustrate that our method achieves consistent improvements in both instruction following (IFEval from 83.36% to 86.65%) and writing quality (72.75% win rate in manual expert pairwise evaluations on WriteEval). To the best of our knowledge, RLMR is the first work to combine subjective preferences with objective verification in online RL training, providing an effective solution for multi-dimensional creative writing optimization.