Yukun Bao

h-index3

14papers

1,197citations

Novelty45%

AI Score41

Ranked #94,409 of 201,326 authors (top 47%)#21,116 in LG (top 50%)

14 Papers

LGFeb 12

Temporally Unified Adversarial Perturbations for Time Series Forecasting

Ruixian Su, Yukun Bao, Xinze Zhang

While deep learning models have achieved remarkable success in time series forecasting, their vulnerability to adversarial examples remains a critical security concern. However, existing attack methods in the forecasting field typically ignore the temporal consistency inherent in time series data, leading to divergent and contradictory perturbation values for the same timestamp across overlapping samples. This temporally inconsistent perturbations problem renders adversarial attacks impractical for real-world data manipulation. To address this, we introduce Temporally Unified Adversarial Perturbations (TUAPs), which enforce a temporal unification constraint to ensure identical perturbations for each timestamp across all overlapping samples. Moreover, we propose a novel Timestamp-wise Gradient Accumulation Method (TGAM) that provides a modular and efficient approach to effectively generate TUAPs by aggregating local gradient information from overlapping samples. By integrating TGAM with momentum-based attack algorithms, we ensure strict temporal consistency while fully utilizing series-level gradient information to explore the adversarial perturbation space. Comprehensive experiments on three benchmark datasets and four representative state-of-the-art models demonstrate that our proposed method significantly outperforms baselines in both white-box and black-box transfer attack scenarios under TUAP constraints. Moreover, our method also exhibits superior transfer attack performance even without TUAP constraints, demonstrating its effectiveness and superiority in generating adversarial perturbations for time series forecasting models.

LGJun 14, 2024

A Policy Gradient-Based Sequence-to-Sequence Method for Time Series Prediction

Qi Sima, Xinze Zhang, Yukun Bao et al.

Sequence-to-sequence architectures built upon recurrent neural networks have become a standard choice for multi-step-ahead time series prediction. In these models, the decoder produces future values conditioned on contextual inputs, typically either actual historical observations (ground truth) or previously generated predictions. During training, feeding ground-truth values helps stabilize learning but creates a mismatch between training and inference conditions, known as exposure bias, since such true values are inaccessible during real-world deployment. On the other hand, using the model's own outputs as inputs at test time often causes errors to compound rapidly across prediction steps. To mitigate these limitations, we introduce a new training paradigm grounded in reinforcement learning: a policy gradient-based method to learn an adaptive input selection strategy for sequence-to-sequence prediction models. Auxiliary models first synthesize plausible input candidates for the decoder, and a trainable policy network optimized via policy gradients dynamically chooses the most beneficial inputs to maximize long-term prediction performance. Empirical evaluations on diverse time series datasets confirm that our approach enhances both accuracy and stability in multi-step forecasting compared to conventional methods.

LGOct 28, 2021

Multivariate Empirical Mode Decomposition based Hybrid Model for Day-ahead Peak Load Forecasting

Yanmei Huang, Najmul Hasan, Changrui Deng et al.

Accurate day-ahead peak load forecasting is crucial not only for power dispatching but also has a great interest to investors and energy policy maker as well as government. Literature reveals that 1% error drop of forecast can reduce 10 million pounds operational cost. Thus, this study proposed a novel hybrid predictive model built upon multivariate empirical mode decomposition (MEMD) and support vector regression (SVR) with parameters optimized by particle swarm optimization (PSO), which is able to capture precise electricity peak load. The novelty of this study mainly comes from the application of MEMD, which enables the multivariate data decomposition to effectively extract inherent information among relevant variables at different time frequency during the deterioration of multivariate over time. Two real-world load data sets from the New South Wales (NSW) and the Victoria (VIC) in Australia have been considered to verify the superiority of the proposed MEMD-PSO-SVR hybrid model. The quantitative and comprehensive assessments are performed, and the results indicate that the proposed MEMD-PSO-SVR method is a promising alternative for day-ahead electricity peak load forecasting.

LGOct 27, 2021

Comprehensive learning particle swarm optimization enabled modeling framework for multi-step-ahead influenza prediction

Siyue Yang, Yukun Bao

Epidemics of influenza are major public health concerns. Since influenza prediction always relies on the weekly clinical or laboratory surveillance data, typically the weekly Influenza-like illness (ILI) rate series, accurate multi-step-ahead influenza predictions using ILI series is of great importance, especially, to the potential coming influenza outbreaks. This study proposes Comprehensive Learning Particle Swarm Optimization based Machine Learning (CLPSO-ML) framework incorporating support vector regression (SVR) and multilayer perceptron (MLP) for multi-step-ahead influenza prediction. A comprehensive examination and comparison of the performance and potential of three commonly used multi-step-ahead prediction modeling strategies, including iterated strategy, direct strategy and multiple-input multiple-output (MIMO) strategy, was conducted using the weekly ILI rate series from both the Southern and Northern China. The results show that: (1) The MIMO strategy achieves the best multi-step-ahead prediction, and is potentially more adaptive for longer horizon; (2) The iterated strategy demonstrates special potentials for deriving the least time difference between the occurrence of the predicted peak value and the true peak value of an influenza outbreak; (3) For ILI in the Northern China, SVR model implemented with MIMO strategy performs best, and SVR with iterated strategy also shows remarkable performance especially during outbreak periods; while for ILI in the Southern China, both SVR and MLP models with MIMO strategy have competitive prediction performance

LGFeb 3, 2020

Error-feedback stochastic modeling strategy for time series forecasting with convolutional neural networks

Xinze Zhang, Kun He, Yukun Bao

Despite the superiority of convolutional neural networks demonstrated in time series modeling and forecasting, it has not been fully explored on the design of the neural network architecture and the tuning of the hyper-parameters. Inspired by the incremental construction strategy for building a random multilayer perceptron, we propose a novel Error-feedback Stochastic Modeling (ESM) strategy to construct a random Convolutional Neural Network (ESM-CNN) for time series forecasting task, which builds the network architecture adaptively. The ESM strategy suggests that random filters and neurons of the error-feedback fully connected layer are incrementally added to steadily compensate the prediction error during the construction process, and then a filter selection strategy is introduced to enable ESM-CNN to extract the different size of temporal features, providing helpful information at each iterative process for the prediction. The performance of ESM-CNN is justified on its prediction accuracy of one-step-ahead and multi-step-ahead forecasting tasks respectively. Comprehensive experiments on both the synthetic and real-world datasets show that the proposed ESM-CNN not only outperforms the state-of-art random neural networks, but also exhibits stronger predictive power and less computing overhead in comparison to trained state-of-art deep neural network models.

CRFeb 23, 2019

Identifying Malicious Web Domains Using Machine Learning Techniques with Online Credibility and Performance Data

Zhongyi Hu, Raymond Chiong, Ilung Pranata et al.

Malicious web domains represent a big threat to web users' privacy and security. With so much freely available data on the Internet about web domains' popularity and performance, this study investigated the performance of well-known machine learning techniques used in conjunction with this type of online data to identify malicious web domains. Two datasets consisting of malware and phishing domains were collected to build and evaluate the machine learning classifiers. Five single classifiers and four ensemble classifiers were applied to distinguish malicious domains from benign ones. In addition, a binary particle swarm optimisation (BPSO) based feature selection method was used to improve the performance of single classifiers. Experimental results show that, based on the web domains' popularity and performance data features, the examined machine learning techniques can accurately identify malicious domains in different ways. Furthermore, the BPSO-based feature selection procedure is shown to be an effective way to improve the performance of classifiers.

LGOct 19, 2018

Malicious Web Domain Identification using Online Credibility and Performance Data by Considering the Class Imbalance Issue

Zhongyi Hu, Raymond Chiong, Ilung Pranata et al.

Purpose: Malicious web domain identification is of significant importance to the security protection of Internet users. With online credibility and performance data, this paper aims to investigate the use of machine learning tech-niques for malicious web domain identification by considering the class imbalance issue (i.e., there are more benign web domains than malicious ones). Design/methodology/approach: We propose an integrated resampling approach to handle class imbalance by combining the Synthetic Minority Over-sampling TEchnique (SMOTE) and Particle Swarm Optimisation (PSO), a population-based meta-heuristic algorithm. We use the SMOTE for over-sampling and PSO for under-sampling. Findings: By applying eight well-known machine learning classifiers, the proposed integrated resampling approach is comprehensively examined using several imbalanced web domain datasets with different imbalance ratios. Com-pared to five other well-known resampling approaches, experimental results confirm that the proposed approach is highly effective. Practical implications: This study not only inspires the practical use of online credibility and performance data for identifying malicious web domains, but also provides an effective resampling approach for handling the class imbal-ance issue in the area of malicious web domain identification. Originality/value: Online credibility and performance data is applied to build malicious web domain identification models using machine learning techniques. An integrated resampling approach is proposed to address the class im-balance issue. The performance of the proposed approach is confirmed based on real-world datasets with different imbalance ratios.

LGJun 15, 2014

Interval Forecasting of Electricity Demand: A Novel Bivariate EMD-based Support Vector Regression Modeling Framework

Tao Xiong, Yukun Bao, Zhongyi Hu

Highly accurate interval forecasting of electricity demand is fundamental to the success of reducing the risk when making power system planning and operational decisions by providing a range rather than point estimation. In this study, a novel modeling framework integrating bivariate empirical mode decomposition (BEMD) and support vector regression (SVR), extended from the well-established empirical mode decomposition (EMD) based time series modeling framework in the energy demand forecasting literature, is proposed for interval forecasting of electricity demand. The novelty of this study arises from the employment of BEMD, a new extension of classical empirical model decomposition (EMD) destined to handle bivariate time series treated as complex-valued time series, as decomposition method instead of classical EMD only capable of decomposing one-dimensional single-valued time series. This proposed modeling framework is endowed with BEMD to decompose simultaneously both the lower and upper bounds time series, constructed in forms of complex-valued time series, of electricity demand on a monthly per hour basis, resulting in capturing the potential interrelationship between lower and upper bounds. The proposed modeling framework is justified with monthly interval-valued electricity demand data per hour in Pennsylvania-New Jersey-Maryland Interconnection, indicating it as a promising method for interval-valued electricity demand forecasting.

LGJan 11, 2014

Multi-Step-Ahead Time Series Prediction using Multiple-Output Support Vector Regression

Yukun Bao, Tao Xiong, Zhongyi Hu

Accurate time series prediction over long future horizons is challenging and of great interest to both practitioners and academics. As a well-known intelligent algorithm, the standard formulation of Support Vector Regression (SVR) could be taken for multi-step-ahead time series prediction, only relying either on iterated strategy or direct strategy. This study proposes a novel multiple-step-ahead time series prediction approach which employs multiple-output support vector regression (M-SVR) with multiple-input multiple-output (MIMO) prediction strategy. In addition, the rank of three leading prediction strategies with SVR is comparatively examined, providing practical implications on the selection of the prediction strategy for multi-step-ahead forecasting while taking SVR as modeling technique. The proposed approach is validated with the simulated and real datasets. The quantitative and comprehensive assessments are performed on the basis of the prediction accuracy and computational cost. The results indicate that: 1) the M-SVR using MIMO strategy achieves the best accurate forecasts with accredited computational load, 2) the standard SVR using direct strategy achieves the second best accurate forecasts, but with the most expensive computational cost, and 3) the standard SVR using iterated strategy is the worst in terms of prediction accuracy, but with the least computational cost.

AIJan 11, 2014

Does Restraining End Effect Matter in EMD-Based Modeling Framework for Time Series Prediction? Some Experimental Evidences

Tao Xiong, Yukun Bao, Zhongyi Hu

Following the "decomposition-and-ensemble" principle, the empirical mode decomposition (EMD)-based modeling framework has been widely used as a promising alternative for nonlinear and nonstationary time series modeling and prediction. The end effect, which occurs during the sifting process of EMD and is apt to distort the decomposed sub-series and hurt the modeling process followed, however, has been ignored in previous studies. Addressing the end effect issue, this study proposes to incorporate end condition methods into EMD-based decomposition and ensemble modeling framework for one- and multi-step ahead time series prediction. Four well-established end condition methods, Mirror method, Coughlin's method, Slope-based method, and Rato's method, are selected, and support vector regression (SVR) is employed as the modeling technique. For the purpose of justification and comparison, well-known NN3 competition data sets are used and four well-established prediction models are selected as benchmarks. The experimental results demonstrated that significant improvement can be achieved by the proposed EMD-based SVR models with end condition methods. The EMD-SBM-SVR model and EMD-Rato-SVR model, in particular, achieved the best prediction performances in terms of goodness of forecast measures and equality of accuracy of competing forecasts test.

LGJan 9, 2014

A PSO and Pattern Search based Memetic Algorithm for SVMs Parameters Optimization

Yukun Bao, Zhongyi Hu, Tao Xiong

Addressing the issue of SVMs parameters optimization, this study proposes an efficient memetic algorithm based on Particle Swarm Optimization algorithm (PSO) and Pattern Search (PS). In the proposed memetic algorithm, PSO is responsible for exploration of the search space and the detection of the potential regions with optimum solutions, while pattern search (PS) is used to produce an effective exploitation on the potential regions obtained by PSO. Moreover, a novel probabilistic selection strategy is proposed to select the appropriate individuals among the current population to undergo local refinement, keeping a well balance between exploration and exploitation. Experimental results confirm that the local refinement with PS and our proposed selection strategy are effective, and finally demonstrate effectiveness and robustness of the proposed PSO-PS based MA for SVMs parameters optimization.

CEJan 9, 2014

Multiple-output support vector regression with a firefly algorithm for interval-valued stock price index forecasting

Tao Xiong, Yukun Bao, Zhongyi Hu

Highly accurate interval forecasting of a stock price index is fundamental to successfully making a profit when making investment decisions, by providing a range of values rather than a point estimate. In this study, we investigate the possibility of forecasting an interval-valued stock price index series over short and long horizons using multi-output support vector regression (MSVR). Furthermore, this study proposes a firefly algorithm (FA)-based approach, built on the established MSVR, for determining the parameters of MSVR (abbreviated as FA-MSVR). Three globally traded broad market indices are used to compare the performance of the proposed FA-MSVR method with selected counterparts. The quantitative and comprehensive assessments are performed on the basis of statistical criteria, economic criteria, and computational cost. In terms of statistical criteria, we compare the out-of-sample forecasting using goodness-of-forecast measures and testing approaches. In terms of economic criteria, we assess the relative forecast performance with a simple trading strategy. The results obtained in this study indicate that the proposed FA-MSVR method is a promising alternative for forecasting interval-valued financial time series.

LGJan 8, 2014

Beyond One-Step-Ahead Forecasting: Evaluation of Alternative Multi-Step-Ahead Forecasting Models for Crude Oil Prices

Tao Xiong, Yukun Bao, Zhongyi Hu

An accurate prediction of crude oil prices over long future horizons is challenging and of great interest to governments, enterprises, and investors. This paper proposes a revised hybrid model built upon empirical mode decomposition (EMD) based on the feed-forward neural network (FNN) modeling framework incorporating the slope-based method (SBM), which is capable of capturing the complex dynamic of crude oil prices. Three commonly used multi-step-ahead prediction strategies proposed in the literature, including iterated strategy, direct strategy, and MIMO (multiple-input multiple-output) strategy, are examined and compared, and practical considerations for the selection of a prediction strategy for multi-step-ahead forecasting relating to crude oil prices are identified. The weekly data from the WTI (West Texas Intermediate) crude oil spot price are used to compare the performance of the alternative models under the EMD-SBM-FNN modeling framework with selected counterparts. The quantitative and comprehensive assessments are performed on the basis of prediction accuracy and computational cost. The results obtained in this study indicate that the proposed EMD-SBM-FNN model using the MIMO strategy is the best in terms of prediction accuracy with accredited computational load.

AIDec 31, 2013

PSO-MISMO Modeling Strategy for Multi-Step-Ahead Time Series Prediction

Yukun Bao, Tao Xiong, Zhongyi Hu

Multi-step-ahead time series prediction is one of the most challenging research topics in the field of time series modeling and prediction, and is continually under research. Recently, the multiple-input several multiple-outputs (MISMO) modeling strategy has been proposed as a promising alternative for multi-step-ahead time series prediction, exhibiting advantages compared with the two currently dominating strategies, the iterated and the direct strategies. Built on the established MISMO strategy, this study proposes a particle swarm optimization (PSO)-based MISMO modeling strategy, which is capable of determining the number of sub-models in a self-adaptive mode, with varying prediction horizons. Rather than deriving crisp divides with equal-size s prediction horizons from the established MISMO, the proposed PSO-MISMO strategy, implemented with neural networks, employs a heuristic to create flexible divides with varying sizes of prediction horizons and to generate corresponding sub-models, providing considerable flexibility in model construction, which has been validated with simulated and real datasets.