Grzegorz Dudek

LG
h-index23
30papers
523citations
Novelty44%
AI Score46

30 Papers

MEApr 21, 2022
STD: A Seasonal-Trend-Dispersion Decomposition of Time Series

Grzegorz Dudek

The decomposition of a time series is an essential task that helps to understand its very nature. It facilitates the analysis and forecasting of complex time series expressing various hidden components such as the trend, seasonal components, cyclic components and irregular fluctuations. Therefore, it is crucial in many fields for forecasting and decision processes. In recent years, many methods of time series decomposition have been developed, which extract and reveal different time series properties. Unfortunately, they neglect a very important property, i.e. time series variance. To deal with heteroscedasticity in time series, the method proposed in this work -- a seasonal-trend-dispersion decomposition (STD) -- extracts the trend, seasonal component and component related to the dispersion of the time series. We define STD decomposition in two ways: with and without an irregular component. We show how STD can be used for time series analysis and forecasting.

LGDec 18, 2022
Contextually Enhanced ES-dRNN with Dynamic Attention for Short-Term Load Forecasting

Slawek Smyl, Grzegorz Dudek, Paweł Pełka

In this paper, we propose a new short-term load forecasting (STLF) model based on contextually enhanced hybrid and hierarchical architecture combining exponential smoothing (ES) and a recurrent neural network (RNN). The model is composed of two simultaneously trained tracks: the context track and the main track. The context track introduces additional information to the main track. It is extracted from representative series and dynamically modulated to adjust to the individual series forecasted by the main track. The RNN architecture consists of multiple recurrent layers stacked with hierarchical dilations and equipped with recently proposed attentive dilated recurrent cells. These cells enable the model to capture short-term, long-term and seasonal dependencies across time series as well as to weight dynamically the input information. The model produces both point forecasts and predictive intervals. The experimental part of the work performed on 35 forecasting problems shows that the proposed model outperforms in terms of accuracy its predecessor as well as standard statistical models and state-of-the-art machine learning models.

LGMar 2, 2022
ES-dRNN with Dynamic Attention for Short-Term Load Forecasting

Slawek Smyl, Grzegorz Dudek, Paweł Pełka

Short-term load forecasting (STLF) is a challenging problem due to the complex nature of the time series expressing multiple seasonality and varying variance. This paper proposes an extension of a hybrid forecasting model combining exponential smoothing and dilated recurrent neural network (ES-dRNN) with a mechanism for dynamic attention. We propose a new gated recurrent cell -- attentive dilated recurrent cell, which implements an attention mechanism for dynamic weighting of input vector components. The most relevant components are assigned greater weights, which are subsequently dynamically fine-tuned. This attention mechanism helps the model to select input information and, along with other mechanisms implemented in ES-dRNN, such as adaptive time series processing, cross-learning, and multiple dilation, leads to a significant improvement in accuracy when compared to well-established statistical and state-of-the-art machine learning forecasting models. This was confirmed in the extensive experimental study concerning STLF for 35 European countries.

LGMar 17, 2022
Recurrent Neural Networks for Forecasting Time Series with Multiple Seasonality: A Comparative Study

Grzegorz Dudek, Slawek Smyl, Paweł Pełka

This paper compares recurrent neural networks (RNNs) with different types of gated cells for forecasting time series with multiple seasonality. The cells we compare include classical long short term memory (LSTM), gated recurrent unit (GRU), modified LSTM with dilation, and two new cells we proposed recently, which are equipped with dilation and attention mechanisms. To model the temporal dependencies of different scales, our RNN architecture has multiple dilated recurrent layers stacked with hierarchical dilations. The proposed RNN produces both point forecasts and predictive intervals (PIs) for them. An empirical study concerning short-term electrical load forecasting for 35 European countries confirmed that the new gated cells with dilation and attention performed best.

LGMar 2, 2022
Boosted Ensemble Learning based on Randomized NNs for Time Series Forecasting

Grzegorz Dudek

Time series forecasting is a challenging problem particularly when a time series expresses multiple seasonality, nonlinear trend and varying variance. In this work, to forecast complex time series, we propose ensemble learning which is based on randomized neural networks, and boosted in three ways. These comprise ensemble learning based on residuals, corrected targets and opposed response. The latter two methods are employed to ensure similar forecasting tasks are solved by all ensemble members, which justifies the use of exactly the same base models at all stages of ensembling. Unification of the tasks for all members simplifies ensemble learning and leads to increased forecasting accuracy. This was confirmed in an experimental study involving forecasting time series with triple seasonality, in which we compare our three variants of ensemble boosting. The strong points of the proposed ensembles based on RandNNs are extremely rapid training and pattern-based time series representation, which extracts relevant information from time series.

LGFeb 5
Probabilistic Multi-Regional Solar Power Forecasting with Any-Quantile Recurrent Neural Networks

Slawek Smyl, Paweł Pełka, Grzegorz Dudek

The increasing penetration of photovoltaic (PV) generation introduces significant uncertainty into power system operation, necessitating forecasting approaches that extend beyond deterministic point predictions. This paper proposes an any-quantile probabilistic forecasting framework for multi-regional PV power generation based on the Any-Quantile Recurrent Neural Network (AQ-RNN). The model integrates an any-quantile forecasting paradigm with a dual-track recurrent architecture that jointly processes series-specific and cross-regional contextual information, supported by dilated recurrent cells, patch-based temporal modeling, and a dynamic ensemble mechanism. The proposed framework enables the estimation of calibrated conditional quantiles at arbitrary probability levels within a single trained model and effectively exploits spatial dependencies to enhance robustness at the system level. The approach is evaluated using 30 years of hourly PV generation data from 259 European regions and compared against established statistical and neural probabilistic baselines. The results demonstrate consistent improvements in forecast accuracy, calibration, and prediction interval quality, underscoring the suitability of the proposed method for uncertainty-aware energy management and operational decision-making in renewable-dominated power systems.

LGApr 26, 2024Code
Any-Quantile Probabilistic Forecasting of Short-Term Electricity Demand

Slawek Smyl, Boris N. Oreshkin, Paweł Pełka et al.

Power systems operate under uncertainty originating from multiple factors that are impossible to account for deterministically. Distributional forecasting is used to control and mitigate risks associated with this uncertainty. Recent progress in deep learning has helped to significantly improve the accuracy of point forecasts, while accurate distributional forecasting still presents a significant challenge. In this paper, we propose a novel general approach for distributional forecasting capable of predicting arbitrary quantiles. We show that our general approach can be seamlessly applied to two distinct neural architectures leading to the state-of-the-art distributional forecasting results in the context of short-term electricity demand forecasting task. We empirically validate our method on 35 hourly electricity demand time-series for European countries. Our code is available here: https://github.com/boreshkinai/any-quantile.

LGSep 24, 2020Code
N-BEATS neural network for mid-term electricity load forecasting

Boris N. Oreshkin, Grzegorz Dudek, Paweł Pełka et al.

This paper addresses the mid-term electricity load forecasting problem. Solving this problem is necessary for power system operation and planning as well as for negotiating forward contracts in deregulated energy markets. We show that our proposed deep neural network modeling approach based on the deep neural architecture is effective at solving the mid-term electricity load forecasting problem. Proposed neural network has high expressive power to solve non-linear stochastic forecasting problems with time series including trends, seasonality and significant random fluctuations. At the same time, it is simple to implement and train, it does not require signal preprocessing, and it is equipped with a forecast bias reduction mechanism. We compare our approach against ten baseline methods, including classical statistical methods, machine learning and hybrid approaches, on 35 monthly electricity demand time series for European countries. The empirical study shows that proposed neural network clearly outperforms all competitors in terms of both accuracy and forecast bias. Code is available here: https://github.com/boreshkinai/nbeats-midterm.

LGDec 2, 2024
Enhanced N-BEATS for Mid-Term Electricity Demand Forecasting

Mateusz Kasprzyk, Paweł Pełka, Boris N. Oreshkin et al.

This paper presents an enhanced N-BEATS model, N-BEATS*, for improved mid-term electricity load forecasting (MTLF). Building on the strengths of the original N-BEATS architecture, which excels in handling complex time series data without requiring preprocessing or domain-specific knowledge, N-BEATS* introduces two key modifications. (1) A novel loss function -- combining pinball loss based on MAPE with normalized MSE, the new loss function allows for a more balanced approach by capturing both L1 and L2 loss terms. (2) A modified block architecture -- the internal structure of the N-BEATS blocks is adjusted by introducing a destandardization component to harmonize the processing of different time series, leading to more efficient and less complex forecasting tasks. Evaluated on real-world monthly electricity consumption data from 35 European countries, N-BEATS* demonstrates superior performance compared to its predecessor and other established forecasting methods, including statistical, machine learning, and hybrid models. N-BEATS* achieves the lowest MAPE and RMSE, while also exhibiting the lowest dispersion in forecast errors.

LGApr 11, 2025
Forecasting Cryptocurrency Prices using Contextual ES-adRNN with Exogenous Variables

Slawek Smyl, Grzegorz Dudek, Paweł Pełka

In this paper, we introduce a new approach to multivariate forecasting cryptocurrency prices using a hybrid contextual model combining exponential smoothing (ES) and recurrent neural network (RNN). The model consists of two tracks: the context track and the main track. The context track provides additional information to the main track, extracted from representative series. This information as well as information extracted from exogenous variables is dynamically adjusted to the individual series forecasted by the main track. The RNN stacked architecture with hierarchical dilations, incorporating recently developed attentive dilated recurrent cells, allows the model to capture short and long-term dependencies across time series and dynamically weight input information. The model generates both point daily forecasts and predictive intervals for one-day, one-week and four-week horizons. We apply our model to forecast prices of 15 cryptocurrencies based on 17 input variables and compare its performance with that of comparative models, including both statistical and ML ones.

LGApr 11, 2025
Combining Forecasts using Meta-Learning: A Comparative Study for Complex Seasonality

Grzegorz Dudek

In this paper, we investigate meta-learning for combining forecasts generated by models of different types. While typical approaches for combining forecasts involve simple averaging, machine learning techniques enable more sophisticated methods of combining through meta-learning, leading to improved forecasting accuracy. We use linear regression, $k$-nearest neighbors, multilayer perceptron, random forest, and long short-term memory as meta-learners. We define global and local meta-learning variants for time series with complex seasonality and compare meta-learners on multiple forecasting problems, demonstrating their superior performance compared to simple averaging.

LGJan 30, 2025
HKAN: Hierarchical Kolmogorov-Arnold Network without Backpropagation

Grzegorz Dudek, Tomasz Rodak

This paper introduces the Hierarchical Kolmogorov-Arnold Network (HKAN), a novel network architecture that offers a competitive alternative to the recently proposed Kolmogorov-Arnold Network (KAN). Unlike KAN, which relies on backpropagation, HKAN adopts a randomized learning approach, where the parameters of its basis functions are fixed, and linear aggregations are optimized using least-squares regression. HKAN utilizes a hierarchical multi-stacking framework, with each layer refining the predictions from the previous one by solving a series of linear regression problems. This non-iterative training method simplifies computation and eliminates sensitivity to local minima in the loss function. Empirical results show that HKAN delivers comparable, if not superior, accuracy and stability relative to KAN across various regression tasks, while also providing insights into variable importance. The proposed approach seamlessly integrates theoretical insights with practical applications, presenting a robust and efficient alternative for neural network modeling.

LGNov 25, 2025
Multivariate Forecasting of Bitcoin Volatility with Gradient Boosting: Deterministic, Probabilistic, and Feature Importance Perspectives

Grzegorz Dudek, Mateusz Kasprzyk, Paweł Pełka

This study investigates the application of the Light Gradient Boosting Machine (LGBM) model for both deterministic and probabilistic forecasting of Bitcoin realized volatility. Utilizing a comprehensive set of 69 predictors -- encompassing market, behavioral, and macroeconomic indicators -- we evaluate the performance of LGBM-based models and compare them with both econometric and machine learning baselines. For probabilistic forecasting, we explore two quantile-based approaches: direct quantile regression using the pinball loss function, and a residual simulation method that transforms point forecasts into predictive distributions. To identify the main drivers of volatility, we employ gain-based and permutation feature importance techniques, consistently highlighting the significance of trading volume, lagged volatility measures, investor attention, and market capitalization. The results demonstrate that LGBM models effectively capture the nonlinear and high-variance characteristics of cryptocurrency markets while providing interpretable insights into the underlying volatility dynamics.

STAug 21, 2025
Probabilistic Forecasting Cryptocurrencies Volatility: From Point to Quantile Forecasts

Grzegorz Dudek, Witold Orzeszko, Piotr Fiszeder

Cryptocurrency markets are characterized by extreme volatility, making accurate forecasts essential for effective risk management and informed trading strategies. Traditional deterministic (point) forecasting methods are inadequate for capturing the full spectrum of potential volatility outcomes, underscoring the importance of probabilistic approaches. To address this limitation, this paper introduces probabilistic forecasting methods that leverage point forecasts from a wide range of base models, including statistical (HAR, GARCH, ARFIMA) and machine learning (e.g. LASSO, SVR, MLP, Random Forest, LSTM) algorithms, to estimate conditional quantiles of cryptocurrency realized variance. To the best of our knowledge, this is the first study in the literature to propose and systematically evaluate probabilistic forecasts of variance in cryptocurrency markets based on predictions derived from multiple base models. Our empirical results for Bitcoin demonstrate that the Quantile Estimation through Residual Simulation (QRS) method, particularly when applied to linear base models operating on log-transformed realized volatility data, consistently outperforms more sophisticated alternatives. Additionally, we highlight the robustness of the probabilistic stacking framework, providing comprehensive insights into uncertainty and risk inherent in cryptocurrency volatility forecasting. This research fills a significant gap in the literature, contributing practical probabilistic forecasting methodologies tailored specifically to cryptocurrency markets.

LGJun 15, 2024
Stacking for Probabilistic Short-term Load Forecasting

Grzegorz Dudek

In this study, we delve into the realm of meta-learning to combine point base forecasts for probabilistic short-term electricity demand forecasting. Our approach encompasses the utilization of quantile linear regression, quantile regression forest, and post-processing techniques involving residual simulation to generate quantile forecasts. Furthermore, we introduce both global and local variants of meta-learning. In the local-learning mode, the meta-model is trained using patterns most similar to the query pattern.Through extensive experimental studies across 35 forecasting scenarios and employing 16 base forecasting models, our findings underscored the superiority of quantile regression forest over its competitors

LGDec 5, 2021
ES-dRNN: A Hybrid Exponential Smoothing and Dilated Recurrent Neural Network Model for Short-Term Load Forecasting

Slawek Smyl, Grzegorz Dudek, Paweł Pełka

Short-term load forecasting (STLF) is challenging due to complex time series (TS) which express three seasonal patterns and a nonlinear trend. This paper proposes a novel hybrid hierarchical deep learning model that deals with multiple seasonality and produces both point forecasts and predictive intervals (PIs). It combines exponential smoothing (ES) and a recurrent neural network (RNN). ES extracts dynamically the main components of each individual TS and enables on-the-fly deseasonalization, which is particularly useful when operating on a relatively small data set. A multi-layer RNN is equipped with a new type of dilated recurrent cell designed to efficiently model both short and long-term dependencies in TS. To improve the internal TS representation and thus the model's performance, RNN learns simultaneously both the ES parameters and the main mapping function transforming inputs into forecasts. We compare our approach against several baseline methods, including classical statistical methods and machine learning (ML) approaches, on STLF problems for 35 European countries. The empirical study clearly shows that the proposed model has high expressive power to solve nonlinear stochastic forecasting problems with TS including multiple seasonality and significant random fluctuations. In fact, it outperforms both statistical and state-of-the-art ML models in terms of accuracy.

LGJul 8, 2021
Ensembles of Randomized NNs for Pattern-based Time Series Forecasting

Grzegorz Dudek, Paweł Pełka

In this work, we propose an ensemble forecasting approach based on randomized neural networks. Improved randomized learning streamlines the fitting abilities of individual learners by generating network parameters in accordance with the data and target function features. A pattern-based representation of time series makes the proposed approach suitable for forecasting time series with multiple seasonality. We propose six strategies for controlling the diversity of ensemble members. Case studies conducted on four real-world forecasting problems verified the effectiveness and superior performance of the proposed ensemble forecasting approach. It outperformed statistical models as well as state-of-the-art machine learning models in terms of forecasting accuracy. The proposed approach has several advantages: fast and easy training, simple architecture, ease of implementation, high accuracy and the ability to deal with nonstationarity and multiple seasonality in time series.

LGJul 4, 2021
Autoencoder based Randomized Learning of Feedforward Neural Networks for Regression

Grzegorz Dudek

Feedforward neural networks are widely used as universal predictive models to fit data distribution. Common gradient-based learning, however, suffers from many drawbacks making the training process ineffective and time-consuming. Alternative randomized learning does not use gradients but selects hidden node parameters randomly. This makes the training process extremely fast. However, the problem in randomized learning is how to determine the random parameters. A recently proposed method uses autoencoders for unsupervised parameter learning. This method showed superior performance on classification tasks. In this work, we apply this method to regression problems, and, finding that it has some drawbacks, we show how to improve it. We propose a learning method of autoencoders that controls the produced random weights. We also propose how to determine the biases of hidden nodes. We empirically compare autoencoder based learning with other randomized learning methods proposed recently for regression and find that despite the proposed improvement of the autoencoder based learning, it does not outperform its competitors in fitting accuracy. Moreover, the method is much more complex than its competitors.

LGJul 4, 2021
Randomized Neural Networks for Forecasting Time Series with Multiple Seasonality

Grzegorz Dudek

This work contributes to the development of neural forecasting models with novel randomization-based learning methods. These methods improve the fitting abilities of the neural model, in comparison to the standard method, by generating network parameters in accordance with the data and target function features. A pattern-based representation of time series makes the proposed approach useful for forecasting time series with multiple seasonality. In the simulation study, we evaluate the performance of the proposed models and find that they can compete in terms of forecasting accuracy with fully-trained networks. Extremely fast and easy training, simple architecture, ease of implementation, high accuracy as well as dealing with nonstationarity and multiple seasonality in time series make the proposed model very attractive for a wide range of complex time series forecasting problems.

LGJul 4, 2021
Data-Driven Learning of Feedforward Neural Networks with Different Activation Functions

Grzegorz Dudek

This work contributes to the development of a new data-driven method (D-DM) of feedforward neural networks (FNNs) learning. This method was proposed recently as a way of improving randomized learning of FNNs by adjusting the network parameters to the target function fluctuations. The method employs logistic sigmoid activation functions for hidden nodes. In this study, we introduce other activation functions, such as bipolar sigmoid, sine function, saturating linear functions, reLU, and softplus. We derive formulas for their parameters, i.e. weights and biases. In the simulation study, we evaluate the performance of FNN data-driven learning with different activation functions. The results indicate that the sigmoid activation functions perform much better than others in the approximation of complex, fluctuated target functions.

SPApr 22, 2020
Pattern-based Long Short-term Memory for Mid-term Electrical Load Forecasting

Paweł Pełka, Grzegorz Dudek

This work presents a Long Short-Term Memory (LSTM) network for forecasting a monthly electricity demand time series with a one-year horizon. The novelty of this work is the use of pattern representation of the seasonal time series as an alternative to decomposition. Pattern representation simplifies the complex nonlinear and nonstationary time series, filtering out the trend and equalizing variance. Two types of patterns are defined: x-pattern and y-pattern. The former requires additional forecasting for the coding variables. The latter determines the coding variables from the process history. A hybrid approach based on x-patterns turned out to be more accurate than the standard LSTM approach based on a raw time series. In this combined approach an x-pattern is forecasted using a sequence-to-sequence LSTM network and the coding variables are forecasted using exponential smoothing. A simulation study performed on the monthly electricity demand time series for 35 European countries confirmed the high performance of the proposed model and its competitiveness to classical models such as ARIMA and exponential smoothing as well as the MLP neural network model.

LGMar 29, 2020
Are Direct Links Necessary in RVFL NNs for Regression?

Grzegorz Dudek

A random vector functional link network (RVFL) is widely used as a universal approximator for classification and regression problems. The big advantage of RVFL is fast training without backpropagation. This is because the weights and biases of hidden nodes are selected randomly and stay untrained. Recently, alternative architectures with randomized learning are developed which differ from RVFL in that they have no direct links and a bias term in the output layer. In this study, we investigate the effect of direct links and output node bias on the regression performance of RVFL. For generating random parameters of hidden nodes we use the classical method and two new methods recently proposed in the literature. We test the RVFL performance on several function approximation problems with target functions of different nature: nonlinear, nonlinear with strong fluctuations, nonlinear with linear component and linear. Surprisingly, we found that the direct links and output node bias do not play an important role in improving RVFL accuracy for typical nonlinear regression problems.

LGMar 29, 2020
Ensemble Forecasting of Monthly Electricity Demand using Pattern Similarity-based Methods

Paweł Pełka, Grzegorz Dudek

This work presents ensemble forecasting of monthly electricity demand using pattern similarity-based forecasting methods (PSFMs). PSFMs applied in this study include $k$-nearest neighbor model, fuzzy neighborhood model, kernel regression model, and general regression neural network. An integral part of PSFMs is a time series representation using patterns of time series sequences. Pattern representation ensures the input and output data unification through filtering a trend and equalizing variance. Two types of ensembles are created: heterogeneous and homogeneous. The former consists of different type base models, while the latter consists of a single-type base model. Five strategies are used for controlling a diversity of members in a homogeneous approach. The diversity is generated using different subsets of training data, different subsets of features, randomly disrupted input and output variables, and randomly disrupted model parameters. An empirical illustration applies the ensemble models as well as individual PSFMs for comparison to the monthly electricity demand forecasting for 35 European countries.

SPMar 29, 2020
A Hybrid Residual Dilated LSTM end Exponential Smoothing Model for Mid-Term Electric Load Forecasting

Grzegorz Dudek, Paweł Pełka, Slawek Smyl

This work presents a hybrid and hierarchical deep learning model for mid-term load forecasting. The model combines exponential smoothing (ETS), advanced Long Short-Term Memory (LSTM) and ensembling. ETS extracts dynamically the main components of each individual time series and enables the model to learn their representation. Multi-layer LSTM is equipped with dilated recurrent skip connections and a spatial shortcut path from lower layers to allow the model to better capture long-term seasonal relationships and ensure more efficient training. A common learning procedure for LSTM and ETS, with a penalized pinball loss, leads to simultaneous optimization of data representation and forecasting performance. In addition, ensembling at three levels ensures a powerful regularization. A simulation study performed on the monthly electricity demand time series for 35 European countries confirmed the high performance of the proposed model and its competitiveness with classical models such as ARIMA and ETS as well as state-of-the-art models based on machine learning.

LGMar 3, 2020
Pattern Similarity-based Machine Learning Methods for Mid-term Load Forecasting: A Comparative Study

Grzegorz Dudek, Paweł Pełka

Pattern similarity-based methods are widely used in classification and regression problems. Repeated, similar-shaped cycles observed in seasonal time series encourage to apply these methods for forecasting. In this paper we use the pattern similarity-based methods for forecasting monthly electricity demand expressing annual seasonality. An integral part of the models is the time series representation using patterns of time series sequences. Pattern representation ensures the input and output data unification through trend filtering and variance equalization. Consequently, pattern representation simplifies the forecasting problem and allows us to use models based on pattern similarity. We consider four such models: nearest neighbor model, fuzzy neighborhood model, kernel regression model and general regression neural network. A regression function is constructed by aggregation output patterns with weights dependent on the similarity between input patterns. The advantages of the proposed models are: clear principle of operation, small number of parameters to adjust, fast optimization procedure, good generalization ability, working on the newest data without retraining, robustness to missing input variables, and generating a vector as an output. In the experimental part of the work the proposed models were used to forecasting the monthly demand for 35 European countries. The model performances were compared with the performances of the classical models such as ARIMA and exponential smoothing as well as state-of-the-art models such as multilayer perceptron, neuro-fuzzy system and long short-term memory model. The results show high performance of the proposed models which outperform the comparative models in accuracy, simplicity and ease of optimization.

LGSep 4, 2019
A Constructive Approach for Data-Driven Randomized Learning of Feedforward Neural Networks

Grzegorz Dudek

Feedforward neural networks with random hidden nodes suffer from a problem with the generation of random weights and biases as these are difficult to set optimally to obtain a good projection space. Typically, random parameters are drawn from an interval which is fixed before or adapted during the learning process. Due to the different functions of the weights and biases, selecting them both from the same interval is not a good idea. Recently more sophisticated methods of random parameters generation have been developed, such as the data-driven method proposed in \cite{Anon19}, where the sigmoids are placed in randomly selected regions of the input space and then their slopes are adjusted to the local fluctuations of the target function. In this work, we propose an extended version of this method, which constructs iteratively the network architecture. This method successively generates new hidden nodes and accepts them if the training error decreases significantly. The threshold of acceptance is adapted to the current training stage. At the beginning of the training process only those nodes which lead to the largest error reduction are accepted. Then, the threshold is reduced by half to accept those nodes which model the target function details more accurately. This leads to faster convergence and more compact network architecture, as it includes only "significant" neurons. Several application examples are given which confirm this thesis.

LGAug 16, 2019
Generating Random Parameters in Feedforward Neural Networks with Random Hidden Nodes: Drawbacks of the Standard Method and How to Improve It

Grzegorz Dudek

The standard method of generating random weights and biases in feedforward neural networks with random hidden nodes, selects them both from the uniform distribution over the same fixed interval. In this work, we show the drawbacks of this approach and propose a new method of generating random parameters. This method ensures the most nonlinear fragments of sigmoids, which are most useful in modeling target function nonlinearity, are kept in the input hypercube. In addition, we show how to generate activation functions with uniformly distributed slope angles.

LGAug 15, 2019
Improving Randomized Learning of Feedforward Neural Networks by Appropriate Generation of Random Parameters

Grzegorz Dudek

In this work, a method of random parameters generation for randomized learning of a single-hidden-layer feedforward neural network is proposed. The method firstly, randomly selects the slope angles of the hidden neurons activation functions from an interval adjusted to the target function, then randomly rotates the activation functions, and finally distributes them across the input space. For complex target functions the proposed method gives better results than the approach commonly used in practice, where the random parameters are selected from the fixed interval. This is because it introduces the steepest fragments of the activation functions into the input hypercube, avoiding their saturation fragments.

LGAug 11, 2019
Data-Driven Randomized Learning of Feedforward Neural Networks

Grzegorz Dudek

Randomized methods of neural network learning suffer from a problem with the generation of random parameters as they are difficult to set optimally to obtain a good projection space. The standard method draws the parameters from a fixed interval which is independent of the data scope and activation function type. This does not lead to good results in the approximation of the strongly nonlinear functions. In this work, a method which adjusts the random parameters, representing the slopes and positions of the sigmoids, to the target function features is proposed. The method randomly selects the input space regions, places the sigmoids in these regions and then adjusts the sigmoid slopes to the local fluctuations of the target function. This brings very good results in the approximation of the complex target functions when compared to the standard fixed interval method and other methods recently proposed in the literature.

NEOct 13, 2017
A Method of Generating Random Weights and Biases in Feedforward Neural Networks with Random Hidden Nodes

Grzegorz Dudek

Neural networks with random hidden nodes have gained increasing interest from researchers and practical applications. This is due to their unique features such as very fast training and universal approximation property. In these networks the weights and biases of hidden nodes determining the nonlinear feature mapping are set randomly and are not learned. Appropriate selection of the intervals from which weights and biases are selected is extremely important. This topic has not yet been sufficiently explored in the literature. In this work a method of generating random weights and biases is proposed. This method generates the parameters of the hidden nodes in such a way that nonlinear fragments of the activation functions are located in the input space regions with data and can be used to construct the surface approximating a nonlinear target function. The weights and biases are dependent on the input data range and activation function type. The proposed methods allows us to control the generalization degree of the model. These all lead to improvement in approximation performance of the network. Several experiments show very promising results.