CL · Mar 1
Thoth: Mid-Training Bridges LLMs to Time Series Understanding
Jiafeng Lin, Yuxuan Wang, Jialong Wu et al.
Large Language Models (LLMs) have demonstrated remarkable success in general-purpose reasoning. However, they still struggle to understand and reason about time series data, which limits their effectiveness in decision-making scenarios that depend on temporal dynamics. In this paper, we propose Thoth, the first family of mid-trained LLMs with general-purpose time series understanding capabilities. As a pivotal intermediate stage, mid-training achieves task- and domain-agnostic alignment between time series and natural language, for which we construct Book-of-Thoth, a high-quality, time-series-centric mid-training corpus. Book-of-Thoth enables both time-series-to-text and text-to-time-series generation, equipping LLMs with a foundational grasp of temporal patterns. To better evaluate advanced reasoning capabilities, we further present KnoTS, a novel benchmark of knowledge-intensive time series understanding, designed for joint reasoning over temporal patterns and domain knowledge. Extensive experiments demonstrate that mid-training with Book-of-Thoth enables Thoth to significantly outperform its base model and advanced LLMs across a range of time series question answering benchmarks. Moreover, Thoth exhibits superior capabilities when fine-tuned under data scarcity, underscoring the effectiveness of mid-training for time series understanding. Code is available at: https://github.com/thuml/Thoth.
LG · Feb 25
TiMi: Empower Time Series Transformers with Multimodal Mixture of Experts
Jiafeng Lin, Yuxuan Wang, Huakun Luo et al.
Multimodal time series forecasting has garnered significant attention for its potential to provide more accurate predictions than traditional single-modality models by leveraging rich information inherent in other modalities. However, due to fundamental challenges in modality alignment, existing methods often struggle to effectively incorporate multimodal data into predictions, particularly textual information that has a causal influence on time series fluctuations, such as emergency reports and policy announcements. In this paper, we reflect on the role of textual information in numerical forecasting and propose Time series transformers with Multimodal Mixture-of-Experts, TiMi, to unleash the causal reasoning capabilities of LLMs. Concretely, TiMi utilizes LLMs to generate inferences on future developments, which serve as guidance for time series forecasting. To seamlessly integrate both exogenous factors and time series into predictions, we introduce a Multimodal Mixture-of-Experts (MMoE) module as a lightweight plug-in to empower Transformer-based time series models for multimodal forecasting, eliminating the need for explicit representation-level alignment. Experimentally, our proposed TiMi demonstrates consistent state-of-the-art performance on sixteen real-world multimodal forecasting benchmarks, outperforming advanced baselines while offering both strong adaptability and interpretability.
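As a rough illustration of how a mixture-of-experts layer can let textual information steer a time series model without aligning the two representations, the sketch below routes time series tokens through several linear experts and mixes their outputs with gate weights computed from a text embedding. This is not TiMi's MMoE module: the function name `text_routed_moe`, the shapes, the linear experts, and the choice to compute the gate from the text side alone are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def text_routed_moe(ts_tokens, text_emb, expert_weights, gate_weights):
    """Hypothetical text-gated mixture of linear experts (illustrative only).

    ts_tokens:      (n_tokens, d)       time series token embeddings
    text_emb:       (d_text,)           embedding of the textual guidance
    expert_weights: (n_experts, d, d)   one linear map per expert
    gate_weights:   (d_text, n_experts) projects text to expert scores
    """
    # Assumption: the gate depends only on the text embedding, so the text
    # decides how experts are mixed and is never added to the tokens.
    gate = softmax(text_emb @ gate_weights)                       # (n_experts,)
    # Apply every expert to every token.
    expert_out = np.einsum("td,edk->etk", ts_tokens, expert_weights)
    # Weighted sum of expert outputs per token.
    mixed = np.einsum("e,etk->tk", gate, expert_out)              # (n_tokens, d)
    return mixed, gate

rng = np.random.default_rng(0)
ts_tokens = rng.normal(size=(4, 8))
text_emb = rng.normal(size=16)
expert_weights = rng.normal(size=(3, 8, 8))
gate_weights = rng.normal(size=(16, 3))

mixed, gate = text_routed_moe(ts_tokens, text_emb, expert_weights, gate_weights)
print(mixed.shape, gate)
```

Because the text only influences the mixing weights in this sketch, the time series tokens and the text embedding can have different dimensions and never need to share a representation space.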
SY · May 13
Battery-Assisted Operation of Hyperscale AI Data Centers under Connect-and-Manage Interconnection Practices
Xin Lu, Jing Qiu, Jiafeng Lin et al.
Emerging connect-and-manage practices allow new transmission-connected mega-loads to connect while enforcing time-varying admissible power exchange limits at the point of common coupling (PCC) in real time. Hyperscale artificial intelligence data centers (AIDCs), whose demand can reach hundreds of megawatts and whose internal computing-cooling dynamics evolve rapidly, can therefore face frequent conflicts between workload continuity requirements and externally imposed PCC envelopes. This paper proposes a battery-assisted operational framework in which an on-site battery energy storage system (BESS) serves as a physical buffering interface to reconcile fast internal dynamics with time-varying interconnection limits. A continuity-aware energy-computation model is developed to jointly capture checkpoint-constrained AI training workloads, information technology (IT) computing power-throughput characteristics, and IT-cooling thermal dynamics. A two-stage decision framework is then formulated, consisting of scenario-based day-ahead workload commitment and a real-time receding-horizon delivery assurance controller that enforces battery, thermal, and grid-interaction constraints. Case studies on the IEEE 39-bus system with real Australian data demonstrate that BESS substantially increases credible day-ahead workload commitment and improves real-time delivery robustness under transmission congestion. Sensitivity analyses further reveal a regime-dependent role transition of BESS: from feasibility-oriented continuity support when PCC limits are binding to economy-driven flexibility provision as transmission constraints are relaxed.
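To make the buffering idea concrete, the toy sketch below shows a battery absorbing the gap between data center demand and a time-varying PCC import limit. It is a greedy, single-step rule, not the paper's two-stage day-ahead and receding-horizon framework. The function name `buffer_with_battery`, the lossless battery, the hourly step, and all numbers are assumptions for illustration.

```python
def buffer_with_battery(demand_mw, pcc_limit_mw, capacity_mwh,
                        max_power_mw, soc_mwh=0.0, dt_h=1.0):
    """Greedy battery buffering against a PCC import limit (illustrative).

    Assumptions: lossless charging/discharging, no thermal or workload
    model, and no look-ahead. Returns per-step grid import, unmet demand,
    and state of charge.
    """
    grid, unmet, soc_trace = [], [], []
    for d, limit in zip(demand_mw, pcc_limit_mw):
        if d > limit:
            # Demand exceeds the envelope: discharge as much as power
            # rating and stored energy allow; the rest goes unserved.
            discharge = min(d - limit, max_power_mw, soc_mwh / dt_h)
            soc_mwh -= discharge * dt_h
            grid.append(limit)
            unmet.append(d - limit - discharge)
        else:
            # Headroom under the envelope: charge without exceeding the
            # limit, the power rating, or the remaining capacity.
            charge = min(limit - d, max_power_mw,
                         (capacity_mwh - soc_mwh) / dt_h)
            soc_mwh += charge * dt_h
            grid.append(d + charge)
            unmet.append(0.0)
        soc_trace.append(soc_mwh)
    return grid, unmet, soc_trace

# Hypothetical four-hour profile in MW with a flat 100 MW limit.
demand = [80, 120, 130, 90]
limit = [100, 100, 100, 100]
grid, unmet, soc_trace = buffer_with_battery(
    demand, limit, capacity_mwh=40, max_power_mw=30)
print(grid, unmet, soc_trace)
```

In this profile the battery covers the second hour's 20 MW excess but is empty by the third hour, which is the kind of shortfall a look-ahead controller would try to avoid by pre-charging or rescheduling workload.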
LG · Mar 5
Aura: Universal Multi-dimensional Exogenous Integration for Aviation Time Series
Jiafeng Lin, Mengren Zheng, Simeng Ye et al.
Time series forecasting has witnessed an increasing demand across diverse industrial applications, where accurate predictions are pivotal for informed decision-making. Beyond numerical time series data, reliable forecasting in practical scenarios requires integrating diverse exogenous factors. Such exogenous information is often multi-dimensional or even multimodal, introducing heterogeneous interactions that unimodal time series models struggle to capture. In this paper, we delve into an aviation maintenance scenario and identify three types of exogenous factors that influence temporal dynamics through distinct interaction modes. Based on this empirical insight, we propose Aura, a universal framework that explicitly organizes and encodes heterogeneous external information according to its interaction mode with the target time series. Specifically, Aura utilizes a tailored tripartite encoding mechanism to embed heterogeneous features into well-established time series models, ensuring seamless integration of non-sequential context. Extensive experiments on a large-scale, three-year industrial dataset from China Southern Airlines, covering the Boeing 777 and Airbus A320 fleets, demonstrate that Aura consistently achieves state-of-the-art performance, outperforming all baselines, and exhibits superior adaptability. Our findings highlight Aura's potential as a general-purpose enhancement for aviation safety and reliability.