Tao Xiong

LG
h-index: 16
22 papers
1,244 citations
Novelty: 47%
AI Score: 57

22 Papers

AI | Jun 2 | Code
DeskCraft: Benchmarking Desktop Agents on Professional Workflows and Human-in-the-Loop Collaboration

Wenkai Wang, Tao Xiong, Jingchen Ni et al.

Real-world professional desktop workflows in specialized creative and engineering software unfold over long horizons and often require human-in-the-loop coordination, where agents proactively seek necessary information and users provide additional instructions, clarifications, feedback, or corrections as the task progresses. Yet existing desktop GUI benchmarks mostly reduce this setting to short, simplified tasks with all user instructions provided upfront. To address this issue, we introduce DeskCraft, a desktop GUI benchmark targeting long-horizon creative and engineering workflows and proactive human-agent collaboration. DeskCraft organizes tasks into a multilevel difficulty taxonomy, with long-horizon tasks requiring over 50 execution steps, and covers professional creative software across design, video, audio, and 3D creation. Furthermore, DeskCraft formalizes human-agent collaboration into an interaction protocol covering mid-turn and post-turn exchanges. Mid-turn interaction captures both agent-initiated clarification under uncertainty and user-initiated interruption during execution, while post-turn interaction accommodates user-driven feedback after the agent signals completion, together spanning the full space of realistic collaboration patterns. We evaluate 18 proprietary and open-source agents on 538 tasks and find that GPT-5.4 reaches 31.6% on standard tasks and 27.6% on interactive tasks. Further analyses reveal persistent failures in long-horizon workflow delivery and proactive clarification. We will open-source all evaluation code, tasks, and data at https://github.com/mrwwk/DeskCraft.

NA | Mar 12, 2018
A Hybrid Discontinuous Galerkin Scheme for Multi-scale Kinetic Equations

Francis Filbet, Tao Xiong

We develop a multi-dimensional hybrid discontinuous Galerkin method for multi-scale kinetic equations. This method is based on moment realizability matrices, a concept introduced by D. Levermore, W. Morokoff and B. Nadiga for one-dimensional problems. The main issue addressed in this paper is to provide a simple indicator to select the most appropriate model and to apply a compact numerical scheme to reduce the interface region between different models. We also construct a numerical flux for the fluid model obtained as the asymptotic limit of the flux of the kinetic equation. Finally, we perform several numerical simulations for time evolution and stationary problems.

NA | Feb 6, 2016
High Order Hierarchical Asymptotic Preserving Nodal Discontinuous Galerkin IMEX Schemes For The BGK Equation

Tao Xiong, Jingmei Qiu

A class of high order asymptotic preserving (AP) schemes has been developed for the BGK equation in Xiong et al. (2015) [37], which is based on the micro-macro formulation of the equation. The nodal discontinuous Galerkin (NDG) method with Lagrangian basis functions for spatial discretization and a globally stiffly accurate implicit-explicit (IMEX) Runge-Kutta (RK) scheme for time discretization are introduced with asymptotic preserving properties. However, it is only necessary to solve the kinetic equation when the hydrodynamic description breaks down. Motivated by the recent work in Filbet and Rey (2015) [23], it is more natural to construct a hierarchical scheme under the NDG-IMEX framework without hybridization, as the formal analysis in [37] shows that when $ε$ is small, the NDG-IMEX scheme becomes a local discontinuous Galerkin (LDG) scheme for the compressible Navier-Stokes equations, and when $ε=0$ it is a discontinuous Galerkin (DG) scheme for the compressible Euler equations. Moreover, we propose to combine the kinetic regime with the hydrodynamic regime including both the compressible Euler and Navier-Stokes equations. Numerical experiments demonstrate very decent performance of the new approach. In our numerics, all three regimes are clearly divided, leading to great savings in terms of the computational cost.

AI | Aug 6, 2025 | Code
OS Agents: A Survey on MLLM-based Agents for General Computing Devices Use

Xueyu Hu, Tao Xiong, Biao Yi et al.

The dream to create AI assistants as capable and versatile as the fictional J.A.R.V.I.S from Iron Man has long captivated imaginations. With the evolution of (multi-modal) large language models ((M)LLMs), this dream is closer to reality, as (M)LLM-based Agents using computing devices (e.g., computers and mobile phones) by operating within the environments and interfaces (e.g., Graphical User Interface (GUI)) provided by operating systems (OS) to automate tasks have significantly advanced. This paper presents a comprehensive survey of these advanced agents, designated as OS Agents. We begin by elucidating the fundamentals of OS Agents, exploring their key components including the environment, observation space, and action space, and outlining essential capabilities such as understanding, planning, and grounding. We then examine methodologies for constructing OS Agents, focusing on domain-specific foundation models and agent frameworks. A detailed review of evaluation protocols and benchmarks highlights how OS Agents are assessed across diverse tasks. Finally, we discuss current challenges and identify promising directions for future research, including safety and privacy, personalization and self-evolution. This survey aims to consolidate the state of OS Agents research, providing insights to guide both academic inquiry and industrial development. An open-source GitHub repository is maintained as a dynamic resource to foster further innovation in this field. We present a 9-page version of our work, accepted by ACL 2025, to provide a concise overview to the domain.

LG | Jun 21, 2022
sqSGD: Locally Private and Communication Efficient Federated Learning

Yan Feng, Tao Xiong, Ruofan Wu et al.

Federated learning (FL) is a technique that trains machine learning models from decentralized data sources. We study FL under local notions of privacy constraints, which provide strong protection against sensitive data disclosures by obfuscating the data before it leaves the client. We identify two major concerns in designing practical privacy-preserving FL algorithms: communication efficiency and high-dimensional compatibility. We then develop a gradient-based learning algorithm called sqSGD (selective quantized stochastic gradient descent) that addresses both concerns. The proposed algorithm is based on a novel privacy-preserving quantization scheme that uses a constant number of bits per dimension per client. Then we improve the base algorithm in three ways. First, we apply a gradient subsampling strategy that simultaneously offers better training performance and smaller communication costs under a fixed privacy budget. Second, we utilize randomized rotation as a preprocessing step to reduce quantization error. Third, an adaptive gradient norm upper bound shrinkage strategy is adopted to improve accuracy and stabilize training. Finally, the practicality of the proposed framework is demonstrated on benchmark datasets. Experimental results show that sqSGD successfully learns large models like LeNet and ResNet with local privacy constraints. In addition, with fixed privacy and communication level, the performance of sqSGD significantly dominates that of various baseline algorithms.

ML | Dec 3, 2018 | Code
Sensitivity based Neural Networks Explanations

Enguerrand Horel, Virgile Mison, Tao Xiong et al.

Although neural networks can achieve very high predictive performance on various tasks such as image recognition or natural language processing, they are often considered opaque "black boxes". The difficulty of interpreting the predictions of a neural network often prevents its use in fields where explainability is important, such as the financial industry, where regulators and auditors often insist on this aspect. In this paper, we present a way to assess the relative importance of the input features of a neural network based on the sensitivity of the model output with respect to its input. This method has the advantage of being fast to compute; it can provide both global and local levels of explanations and is applicable to many types of neural network architectures. We illustrate the performance of this method on both synthetic and real data and compare it with other interpretation techniques. This method is implemented in an open-source Python package that allows its users to easily generate and visualize explanations for their neural networks.
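
The sensitivity idea in this abstract can be illustrated with a minimal sketch. It assumes a toy scalar model and central finite differences in place of analytic gradients, and it aggregates local sensitivities into a global score by a root mean square, which is an assumption rather than the paper's definition. The names `toy_model`, `local_sensitivity`, and `global_importance` are hypothetical and are not the API of the authors' package.

```python
from typing import Callable, List, Sequence

Model = Callable[[Sequence[float]], float]


def local_sensitivity(model: Model, x: Sequence[float], eps: float = 1e-5) -> List[float]:
    """Approximate d(model)/d(x_i) at one input by central finite differences."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((model(up) - model(down)) / (2 * eps))
    return grads


def global_importance(model: Model, data: Sequence[Sequence[float]], eps: float = 1e-5) -> List[float]:
    """Root mean square of the local sensitivities over a dataset (assumed aggregation)."""
    totals = [0.0] * len(data[0])
    for x in data:
        for i, g in enumerate(local_sensitivity(model, x, eps)):
            totals[i] += g * g
    return [(t / len(data)) ** 0.5 for t in totals]


def toy_model(x: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained network; ignores the third feature."""
    return 3.0 * x[0] + 0.5 * x[1] ** 2


data = [[0.0, 1.0, 5.0], [1.0, -1.0, 2.0], [2.0, 0.0, -3.0]]
local = local_sensitivity(toy_model, data[0])   # local explanation for one sample
importance = global_importance(toy_model, data)  # global explanation over the dataset
```

Here the local explanation ranks features for a single input, while the global score summarizes them over the dataset; the unused third feature receives zero importance in both.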

NA | Mar 27
Discrete hypocoercive estimates for discontinuous Galerkin methods: application to the Vlasov-Poisson-Fokker-Planck system

Yi Cai, Alain Blaustein, Tao Xiong et al.

We develop and analyze a class of structure-preserving discontinuous Galerkin schemes for the nonlinear Vlasov-Poisson-Fokker-Planck model, reformulated as a hyperbolic system through a Hermite expansion in the velocity variable. We discretize the Vlasov-Fokker-Planck equation with the discontinuous Galerkin method, while the Poisson equation is approximated with either a discontinuous Galerkin method or a Raviart-Thomas mixed finite element method. We prove the exponential relaxation to equilibrium for suitable initial data, uniformly with respect to the discretization parameters thanks to discrete hypocoercivity arguments. Moreover, we check that the resulting semi-discrete schemes preserve the physical invariants along with the $L^2$ variational structure of the linearized model. Numerical simulations verify the accuracy and the long-time behavior of the scheme.

NA | Apr 26
Asymptotic preserving scheme for the shallow water equations with non-flat bottom topography and Manning friction term

Guanlan Huang, Sebastiano Boscarino, Tao Xiong

In our previous work [29], we proposed a class of high-order asymptotic preserving (AP) finite difference weighted essentially non-oscillatory (WENO) schemes for solving the shallow water equations (SWEs) with bottom topography and Manning friction, utilizing a penalization technique inspired by [6]. Although the added weighted diffusive term enhanced stability, it increased computational cost and slowed down the convergence rate in the intermediate regime between convection and diffusion. In this paper, we extend our previous study by removing the penalization while preserving the AP property. To achieve this, we employ a high order semi-implicit implicit-explicit Runge-Kutta (SI-IMEX-RK) time discretization, coupled with the high-order WENO reconstruction for first-order derivatives and a central difference scheme for second-order spatial derivatives. This combination yields a class of fully high-order schemes. Theoretical analysis and numerical experiments demonstrate that the proposed schemes retain AP, asymptotically accurate (AA) and well-balanced properties, while offering higher computational efficiency compared to our previous schemes in [29], especially in the intermediate regime between convection and diffusion. Moreover, treating the momentum in the friction terms implicitly is essential for preserving the AP property; otherwise, the scheme fails to converge to the limiting equations. This indicates that implicit treatment of Manning friction is necessary for the stability of the method.

CL | Jul 1, 2025
Mixture of Reasonings: Teach Large Language Models to Reason with Adaptive Strategies

Tao Xiong, Xavier Hu, Wenyan Fan et al.

Large language models (LLMs) excel in complex tasks through advanced prompting techniques like Chain-of-Thought (CoT) and Tree-of-Thought (ToT), but their reliance on manually crafted, task-specific prompts limits adaptability and efficiency. We introduce Mixture of Reasoning (MoR), a training framework that embeds diverse reasoning strategies into LLMs for autonomous, task-adaptive reasoning without external prompt engineering. MoR has two phases: Thought Generation, creating reasoning chain templates with models like GPT-4o, and SFT Dataset Construction, pairing templates with benchmark datasets for supervised fine-tuning. Our experiments show that MoR significantly enhances performance, with MoR150 achieving 0.730 (2.2% improvement) using CoT prompting and 0.734 (13.5% improvement) compared to baselines. MoR eliminates the need for task-specific prompts, offering a generalizable solution for robust reasoning across diverse tasks.

AI | Sep 27, 2025
GUI-PRA: Process Reward Agent for GUI Tasks

Tao Xiong, Xavier Hu, Yurun Chen et al.

Graphical User Interface (GUI) Agents powered by Multimodal Large Language Models (MLLMs) show significant potential for automating tasks. However, they often struggle with long-horizon tasks, leading to frequent failures. Process Reward Models (PRMs) are a promising solution, as they can guide these agents with crucial process signals during inference. Nevertheless, their application to the GUI domain presents unique challenges. When processing dense artificial inputs with long history data, PRMs suffer from a "lost in the middle" phenomenon, where the overwhelming historical context compromises the evaluation of the current step. Furthermore, standard PRMs lack awareness of GUI changes, providing static evaluations that are disconnected from the dynamic consequences of actions, a critical mismatch with the inherently dynamic nature of GUI tasks. In response to these challenges, we introduce GUI-PRA (Process Reward Agent for GUI Tasks), a judge agent designed to provide better process rewards than a standard PRM by intelligently processing historical context and actively perceiving UI state changes. Specifically, to directly combat the "lost in the middle" phenomenon, we introduce a dynamic memory mechanism consisting of two core components: a Relevance-based Retrieval Module to actively fetch pertinent information from long histories and a Progressive Summarization Module to dynamically condense growing interaction data, ensuring the model focuses on relevant context. Moreover, to address the lack of awareness of UI changes, we introduce an Adaptive UI Perception mechanism. This mechanism enables the agent to reason about UI state changes and dynamically select the most appropriate tool to gather grounded visual evidence, ensuring its evaluation is always informed by the current UI context.

LG | Jul 3, 2021
SHORING: Design Provable Conditional High-Order Interaction Network via Symbolic Testing

Hui Li, Xing Fu, Ruofan Wu et al.

Deep learning provides a promising way to extract effective representations from raw data in an end-to-end fashion and has proven its effectiveness in various domains such as computer vision, natural language processing, etc. However, in domains such as content/product recommendation and risk management, where sequences of event data are the most used raw data form and expert-derived features are more commonly used, deep learning models struggle to dominate the game. In this paper, we propose a symbolic testing framework that helps to answer the question of what kinds of expert-derived features could be learned by a neural network. Inspired by this testing framework, we introduce an efficient architecture named SHORING, which contains two components: an event network and a sequence network. The event network efficiently learns arbitrarily high-order event-level embeddings via a provable reparameterization trick, and the sequence network aggregates the sequence of event-level embeddings. We argue that SHORING is capable of learning certain standard symbolic expressions which the standard multi-head self-attention network fails to learn, and conduct comprehensive experiments and ablation studies on four synthetic datasets and three real-world datasets. The results show that SHORING empirically outperforms the state-of-the-art methods.

CV | Mar 18, 2019
Bilinear Representation for Language-based Image Editing Using Conditional Generative Adversarial Networks

Xiaofeng Mao, Yuefeng Chen, Yuhong Li et al.

The task of Language-Based Image Editing (LBIE) aims at generating a target image by editing the source image based on the given language description. The main challenge of LBIE is to disentangle the semantics in image and text and then combine them to generate realistic images. Therefore, the editing performance is heavily dependent on the learned representation. In this work, a conditional generative adversarial network (cGAN) is utilized for LBIE. We find that existing conditioning methods in cGAN lack representation power as they cannot learn the second-order correlation between two conditioning vectors. To solve this problem, we propose an improved conditional layer named Bilinear Residual Layer (BRL) to learn more powerful representations for the LBIE task. Qualitative and quantitative comparisons demonstrate that our method can generate images with higher quality when compared to previous LBIE techniques.

CV | May 17, 2017
Automatic Vertebra Labeling in Large-Scale 3D CT using Deep Image-to-Image Network with Message Passing and Sparsity Regularization

Dong Yang, Tao Xiong, Daguang Xu et al.

Automatic localization and labeling of vertebrae in 3D medical images plays an important role in many clinical tasks, including pathological diagnosis, surgical planning and postoperative assessment. However, the unusual conditions of pathological cases, such as abnormal spine curvature, bright visual imaging artifacts caused by metal implants, and a limited field of view, increase the difficulty of accurate localization. In this paper, we propose an automatic and fast algorithm to localize and label the vertebra centroids in 3D CT volumes. First, we deploy a deep image-to-image network (DI2IN) to initialize vertebra locations, employing the convolutional encoder-decoder architecture together with multi-level feature concatenation and deep supervision. Next, the centroid probability maps from DI2IN are iteratively evolved with message passing schemes based on the mutual relation of vertebra centroids. Finally, the localization results are refined with sparsity regularization. The proposed method is evaluated on a public dataset of 302 spine CT volumes with various pathologies. Our method outperforms other state-of-the-art methods in terms of localization accuracy. The run time is around 3 seconds on average per case. To further boost the performance, we retrain the DI2IN on an additional 1000+ 3D CT volumes from different patients. To the best of our knowledge, this is the first time more than 1000 3D CT volumes with expert annotation are adopted in experiments for anatomic landmark detection tasks. Our experimental results show that training with such a large dataset significantly improves the performance and that the overall identification rate, for the first time to our knowledge, reaches 90%.

NA | Jun 23, 2017
A high order bound preserving finite difference linear scheme for incompressible flows

Tao Xiong

We propose a high order finite difference linear scheme combined with a high order bound preserving maximum-principle-preserving (MPP) flux limiter to solve the incompressible flow system. For such problems with highly oscillatory structure but not strong shocks, our approach seems to be less dissipative and much less costly than a WENO type scheme, and has high resolution due to a Hermite reconstruction. Spurious numerical oscillations can be controlled by the MPP flux limiter. Numerical tests are performed for the Vlasov-Poisson system, the 2D guiding-center model and the incompressible Euler system. The comparison between the linear and WENO type schemes demonstrates the good performance of our proposed approach.

NA | Jul 25, 2016
Conservative Multi-Dimensional Semi-Lagrangian Finite Difference Scheme: Stability and Applications to the Kinetic and Fluid Simulations

Tao Xiong, Giovanni Russo, Jing-Mei Qiu

In this paper, we propose a mass conservative semi-Lagrangian finite difference scheme for multi-dimensional problems without dimensional splitting. The semi-Lagrangian scheme, based on tracing characteristics backward in time from grid points, does not necessarily conserve the total mass. To ensure mass conservation, we propose a conservative correction procedure based on a flux difference form. Such a procedure guarantees local mass conservation, while introducing time step constraints for stability. We theoretically investigate such stability constraints from an ODE point of view by assuming exact evaluation of spatial differential operators and from the Fourier analysis for linear PDEs. The scheme is tested on classical two-dimensional linear passive-transport problems, such as linear advection, rotation and swirling deformation. The scheme is applied to solve the nonlinear Vlasov-Poisson system using a high order tracing mechanism proposed in [Qiu and Russo, 2016]. Such a high order characteristics tracing scheme is generalized to the nonlinear guiding center Vlasov model and incompressible Euler system. The effectiveness of the proposed conservative semi-Lagrangian scheme is demonstrated by our extensive numerical tests.
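
To make the statement about mass conservation concrete, here is a minimal one-dimensional sketch of a plain (non-conservative) semi-Lagrangian step on a periodic grid. It assumes linear interpolation and a shift given in grid cells per time step, which are simplifications chosen for illustration; it does not include the paper's multi-dimensional scheme or its flux-difference correction. The function name `semi_lagrangian_step` is hypothetical.

```python
import math
from typing import List, Sequence


def semi_lagrangian_step(u: Sequence[float], shifts: Sequence[float]) -> List[float]:
    """One semi-Lagrangian step on a periodic grid.

    shifts[i] is velocity * dt / dx at grid point i, so the characteristic
    through point i is traced back to the (fractional) index i - shifts[i],
    where the old solution is evaluated by linear interpolation.
    """
    n = len(u)
    new = []
    for i in range(n):
        foot = i - shifts[i]          # foot of the characteristic, in index units
        j = math.floor(foot)          # left neighbour of the foot
        w = foot - j                  # interpolation weight in [0, 1)
        new.append((1.0 - w) * u[j % n] + w * u[(j + 1) % n])
    return new


u0 = [0.0, 1.0, 0.0, 0.0]
shifted = semi_lagrangian_step(u0, [1.0] * 4)            # uniform shift by one cell
half = semi_lagrangian_step(u0, [0.5] * 4)               # uniform shift by half a cell
sheared = semi_lagrangian_step(u0, [0.0, 0.0, 1.0, 1.0])  # non-uniform velocity
```

With a uniform shift the sum of the grid values is unchanged, but with the non-uniform shift above the sum doubles from 1 to 2, which is the kind of mass error a conservative correction is meant to remove.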

LG | Jun 15, 2014
Interval Forecasting of Electricity Demand: A Novel Bivariate EMD-based Support Vector Regression Modeling Framework

Tao Xiong, Yukun Bao, Zhongyi Hu

Highly accurate interval forecasting of electricity demand is fundamental to reducing risk in power system planning and operational decisions, because it provides a range rather than a point estimate. In this study, a novel modeling framework integrating bivariate empirical mode decomposition (BEMD) and support vector regression (SVR), extended from the well-established empirical mode decomposition (EMD) based time series modeling framework in the energy demand forecasting literature, is proposed for interval forecasting of electricity demand. The novelty of this study arises from the employment of BEMD, a new extension of classical empirical mode decomposition (EMD) designed to handle bivariate time series treated as complex-valued time series, as the decomposition method instead of classical EMD, which is only capable of decomposing one-dimensional single-valued time series. The proposed modeling framework uses BEMD to decompose simultaneously both the lower and upper bound time series of electricity demand, constructed in the form of complex-valued time series, on a monthly per hour basis, thereby capturing the potential interrelationship between lower and upper bounds. The proposed modeling framework is validated with monthly interval-valued electricity demand data per hour in the Pennsylvania-New Jersey-Maryland Interconnection, indicating that it is a promising method for interval-valued electricity demand forecasting.

LG | Jan 11, 2014
Multi-Step-Ahead Time Series Prediction using Multiple-Output Support Vector Regression

Yukun Bao, Tao Xiong, Zhongyi Hu

Accurate time series prediction over long future horizons is challenging and of great interest to both practitioners and academics. As a well-known intelligent algorithm, the standard formulation of Support Vector Regression (SVR) can be used for multi-step-ahead time series prediction, relying on either the iterated strategy or the direct strategy. This study proposes a novel multiple-step-ahead time series prediction approach which employs multiple-output support vector regression (M-SVR) with the multiple-input multiple-output (MIMO) prediction strategy. In addition, the ranking of three leading prediction strategies with SVR is comparatively examined, providing practical implications for the selection of the prediction strategy for multi-step-ahead forecasting when SVR is the modeling technique. The proposed approach is validated with simulated and real datasets. The quantitative and comprehensive assessments are performed on the basis of prediction accuracy and computational cost. The results indicate that: 1) the M-SVR using the MIMO strategy achieves the most accurate forecasts with accredited computational load, 2) the standard SVR using the direct strategy achieves the second most accurate forecasts, but with the most expensive computational cost, and 3) the standard SVR using the iterated strategy is the worst in terms of prediction accuracy, but with the least computational cost.
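
The three prediction strategies compared in this abstract differ mainly in how training pairs are built from the series. The following sketch shows that construction only, assuming a lag window of length `d` and a horizon `H`; no SVR is fitted, and the function names are hypothetical rather than taken from the paper.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Pairs = Tuple[List[List[float]], List]


def iterated_pairs(series: Sequence[float], d: int) -> Pairs:
    """One one-step-ahead model: window of d lags -> next value."""
    ts = range(d, len(series))
    return [list(series[t - d:t]) for t in ts], [series[t] for t in ts]


def direct_pairs(series: Sequence[float], d: int, H: int) -> Dict[int, Pairs]:
    """H separate single-output models: model h maps the window to the h-th value ahead."""
    out = {}
    for h in range(1, H + 1):
        ts = range(d, len(series) - h + 1)
        out[h] = ([list(series[t - d:t]) for t in ts], [series[t + h - 1] for t in ts])
    return out


def mimo_pairs(series: Sequence[float], d: int, H: int) -> Pairs:
    """One multiple-output model: window of d lags -> vector of the next H values."""
    ts = range(d, len(series) - H + 1)
    return [list(series[t - d:t]) for t in ts], [list(series[t:t + H]) for t in ts]


def iterated_forecast(one_step_model: Callable[[List[float]], float],
                      window: Sequence[float], H: int) -> List[float]:
    """Iterated strategy at prediction time: feed each forecast back as an input."""
    window = list(window)
    preds = []
    for _ in range(H):
        nxt = one_step_model(window)
        preds.append(nxt)
        window = window[1:] + [nxt]
    return preds


series = [1, 2, 3, 4, 5, 6]
X_it, y_it = iterated_pairs(series, d=2)
direct = direct_pairs(series, d=2, H=2)
X_mimo, Y_mimo = mimo_pairs(series, d=2, H=2)
preds = iterated_forecast(lambda w: w[-1] + 1, [5, 6], H=3)  # toy one-step model
```

The iterated strategy trains one model and reuses its own outputs, the direct strategy trains one model per horizon, and MIMO trains a single model whose target is the whole forecast vector.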

AI | Jan 11, 2014
Does Restraining End Effect Matter in EMD-Based Modeling Framework for Time Series Prediction? Some Experimental Evidences

Tao Xiong, Yukun Bao, Zhongyi Hu

Following the "decomposition-and-ensemble" principle, the empirical mode decomposition (EMD)-based modeling framework has been widely used as a promising alternative for nonlinear and nonstationary time series modeling and prediction. The end effect, which occurs during the sifting process of EMD and is apt to distort the decomposed sub-series and hurt the modeling process that follows, has however been ignored in previous studies. Addressing the end effect issue, this study proposes to incorporate end condition methods into the EMD-based decomposition and ensemble modeling framework for one- and multi-step-ahead time series prediction. Four well-established end condition methods, the Mirror method, Coughlin's method, the Slope-based method, and Rato's method, are selected, and support vector regression (SVR) is employed as the modeling technique. For the purpose of justification and comparison, well-known NN3 competition data sets are used and four well-established prediction models are selected as benchmarks. The experimental results demonstrate that significant improvement can be achieved by the proposed EMD-based SVR models with end condition methods. The EMD-SBM-SVR model and EMD-Rato-SVR model, in particular, achieved the best prediction performance in terms of goodness-of-forecast measures and tests of equal accuracy of competing forecasts.

LG | Jan 9, 2014
A PSO and Pattern Search based Memetic Algorithm for SVMs Parameters Optimization

Yukun Bao, Zhongyi Hu, Tao Xiong

Addressing the issue of SVM parameter optimization, this study proposes an efficient memetic algorithm based on the Particle Swarm Optimization algorithm (PSO) and Pattern Search (PS). In the proposed memetic algorithm, PSO is responsible for exploration of the search space and the detection of the potential regions with optimum solutions, while pattern search (PS) is used to produce an effective exploitation of the potential regions obtained by PSO. Moreover, a novel probabilistic selection strategy is proposed to select the appropriate individuals among the current population to undergo local refinement, keeping a good balance between exploration and exploitation. Experimental results confirm that the local refinement with PS and our proposed selection strategy are effective, and finally demonstrate the effectiveness and robustness of the proposed PSO-PS based MA for SVM parameter optimization.
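
The division of labour between PSO (exploration) and pattern search (exploitation) can be sketched as follows. This is a generic illustration under several assumptions: the objective is a toy sphere function instead of an SVM cross-validation error, the PSO coefficients are common textbook values, and the rank-based refinement probability is a simple stand-in, not the paper's selection strategy. All names are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Objective = Callable[[Sequence[float]], float]


def sphere(x: Sequence[float]) -> float:
    """Toy objective standing in for an SVM validation error."""
    return sum(v * v for v in x)


def pattern_search(f: Objective, x: Sequence[float], step: float = 0.5,
                   iters: int = 20) -> Tuple[List[float], float]:
    """Coordinate pattern search: poll +/- step on each axis, halve step on failure."""
    x, fx = list(x), f(x)
    for _ in range(iters):
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                cand = list(x)
                cand[i] += delta
                fc = f(cand)
                if fc < fx:
                    x, fx, improved = cand, fc, True
        if not improved:
            step *= 0.5
    return x, fx


def memetic_pso(f: Objective, dim: int = 2, n: int = 10, iters: int = 30,
                seed: int = 0) -> Tuple[List[float], float, float]:
    """PSO for exploration, pattern search on probabilistically chosen personal bests."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [list(p) for p in pos]
    pbest_f = [f(p) for p in pos]
    g = min(range(n), key=lambda k: pbest_f[k])
    gbest, gbest_f = list(pbest[g]), pbest_f[g]
    init_f = gbest_f
    for _ in range(iters):
        for k in range(n):
            for i in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[k][i] = (0.7 * vel[k][i]
                             + 1.5 * r1 * (pbest[k][i] - pos[k][i])
                             + 1.5 * r2 * (gbest[i] - pos[k][i]))
                pos[k][i] += vel[k][i]
            fk = f(pos[k])
            if fk < pbest_f[k]:
                pbest[k], pbest_f[k] = list(pos[k]), fk
        # Assumed selection rule: better-ranked particles are refined more often.
        order = sorted(range(n), key=lambda k: pbest_f[k])
        for rank, k in enumerate(order):
            if rng.random() < 0.5 / (rank + 1):
                pbest[k], pbest_f[k] = pattern_search(f, pbest[k], iters=5)
        g = min(range(n), key=lambda k: pbest_f[k])
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = list(pbest[g]), pbest_f[g]
    return gbest, gbest_f, init_f


ps_x, ps_f = pattern_search(sphere, [1.0, -1.0])
best_x, best_f, init_f = memetic_pso(sphere)
```

For SVM tuning, the position vector would hold the hyperparameters being searched and `f` would return a validation error, so each pattern-search poll costs one extra model fit; refining only a subset of particles keeps that cost bounded.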

CE | Jan 9, 2014
Multiple-output support vector regression with a firefly algorithm for interval-valued stock price index forecasting

Tao Xiong, Yukun Bao, Zhongyi Hu

Highly accurate interval forecasting of a stock price index is fundamental to successfully making a profit when making investment decisions, by providing a range of values rather than a point estimate. In this study, we investigate the possibility of forecasting an interval-valued stock price index series over short and long horizons using multi-output support vector regression (MSVR). Furthermore, this study proposes a firefly algorithm (FA)-based approach, built on the established MSVR, for determining the parameters of MSVR (abbreviated as FA-MSVR). Three globally traded broad market indices are used to compare the performance of the proposed FA-MSVR method with selected counterparts. The quantitative and comprehensive assessments are performed on the basis of statistical criteria, economic criteria, and computational cost. In terms of statistical criteria, we compare the out-of-sample forecasting using goodness-of-forecast measures and testing approaches. In terms of economic criteria, we assess the relative forecast performance with a simple trading strategy. The results obtained in this study indicate that the proposed FA-MSVR method is a promising alternative for forecasting interval-valued financial time series.

LG | Jan 8, 2014
Beyond One-Step-Ahead Forecasting: Evaluation of Alternative Multi-Step-Ahead Forecasting Models for Crude Oil Prices

Tao Xiong, Yukun Bao, Zhongyi Hu

An accurate prediction of crude oil prices over long future horizons is challenging and of great interest to governments, enterprises, and investors. This paper proposes a revised hybrid model built upon empirical mode decomposition (EMD) based on the feed-forward neural network (FNN) modeling framework incorporating the slope-based method (SBM), which is capable of capturing the complex dynamic of crude oil prices. Three commonly used multi-step-ahead prediction strategies proposed in the literature, including iterated strategy, direct strategy, and MIMO (multiple-input multiple-output) strategy, are examined and compared, and practical considerations for the selection of a prediction strategy for multi-step-ahead forecasting relating to crude oil prices are identified. The weekly data from the WTI (West Texas Intermediate) crude oil spot price are used to compare the performance of the alternative models under the EMD-SBM-FNN modeling framework with selected counterparts. The quantitative and comprehensive assessments are performed on the basis of prediction accuracy and computational cost. The results obtained in this study indicate that the proposed EMD-SBM-FNN model using the MIMO strategy is the best in terms of prediction accuracy with accredited computational load.

AI | Dec 31, 2013
PSO-MISMO Modeling Strategy for Multi-Step-Ahead Time Series Prediction

Yukun Bao, Tao Xiong, Zhongyi Hu

Multi-step-ahead time series prediction is one of the most challenging research topics in the field of time series modeling and prediction, and is continually under research. Recently, the multiple-input several multiple-outputs (MISMO) modeling strategy has been proposed as a promising alternative for multi-step-ahead time series prediction, exhibiting advantages compared with the two currently dominating strategies, the iterated and the direct strategies. Built on the established MISMO strategy, this study proposes a particle swarm optimization (PSO)-based MISMO modeling strategy, which is capable of determining the number of sub-models in a self-adaptive mode, with varying prediction horizons. Rather than deriving crisp divides with equal-size s prediction horizons from the established MISMO, the proposed PSO-MISMO strategy, implemented with neural networks, employs a heuristic to create flexible divides with varying sizes of prediction horizons and to generate corresponding sub-models, providing considerable flexibility in model construction, which has been validated with simulated and real datasets.
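
The difference between equal-size and flexible divisions of the prediction horizon can be sketched as below. The sketch only builds the partition of horizons 1..H into blocks, one block per sub-model; it assumes that `s` divides `H` in the equal-size case, and it takes the variable block sizes as given rather than searching for them with PSO as the paper does. The function names are hypothetical.

```python
from typing import List, Sequence


def equal_blocks(H: int, s: int) -> List[List[int]]:
    """Split horizons 1..H into H/s consecutive blocks of equal size s."""
    if s <= 0 or H % s != 0:
        raise ValueError("s must be a positive divisor of H in this sketch")
    return [list(range(start, start + s)) for start in range(1, H + 1, s)]


def blocks_from_sizes(sizes: Sequence[int]) -> List[List[int]]:
    """Split horizons 1..sum(sizes) into consecutive blocks of the given sizes."""
    blocks, start = [], 1
    for size in sizes:
        if size <= 0:
            raise ValueError("block sizes must be positive")
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


crisp = equal_blocks(6, 2)              # three sub-models, two outputs each
flexible = blocks_from_sizes([1, 2, 3])  # three sub-models, varying output sizes
```

With `s = 1` every sub-model has a single output, as in the direct strategy, and with `s = H` there is one sub-model with `H` outputs, as in MIMO; variable sizes add intermediate options between those two extremes.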