LG | Feb 27, 2023

Linear pretraining in recurrent mixture density networks

Hubert Normandin-Taillon, Frédéric Godin, Chun Wang

arXiv:2302.14141v1 | 2.01 citations | h-index: 38

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement aimed at researchers and practitioners who use RMDNs for time-series modeling and want to avoid training instability.

The paper tackles the problem of bad local minima and NaN issues during training in recurrent mixture density networks (RMDNs) by proposing a linear pretraining method, which improves performance and ensures the RMDN surpasses its linear GARCH counterpart.
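For readers unfamiliar with the linear benchmark mentioned above, the following is a minimal sketch of a standard GARCH(1,1) variance recursion with a Gaussian likelihood. The order (1,1), the zero-mean Gaussian assumption, the parameter values, and the function names `garch11_variances` and `gaussian_nll` are illustrative assumptions and are not taken from the paper.

```python
import math

def garch11_variances(returns, omega, alpha, beta):
    """Conditional variances of a GARCH(1,1) model (illustrative sketch).

    sigma2[t] = omega + alpha * returns[t-1]**2 + beta * sigma2[t-1]
    """
    # Assumption: start from the unconditional variance,
    # which is finite when alpha + beta < 1.
    sigma2 = [omega / (1.0 - alpha - beta)]
    for r in returns[:-1]:
        sigma2.append(omega + alpha * r * r + beta * sigma2[-1])
    return sigma2

def gaussian_nll(returns, sigma2):
    """Average negative log-likelihood of zero-mean Gaussian returns."""
    total = 0.0
    for r, s2 in zip(returns, sigma2):
        total += 0.5 * (math.log(2.0 * math.pi * s2) + r * r / s2)
    return total / len(returns)

# Made-up returns and parameters, for illustration only.
returns = [0.01, -0.02, 0.015, -0.03, 0.005]
sigma2 = garch11_variances(returns, omega=1e-5, alpha=0.1, beta=0.85)
nll = gaussian_nll(returns, sigma2)
```

The variance update is linear in the previous squared return and the previous variance, which is the sense in which GARCH can be viewed as the linear counterpart of the RMDN.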

We present a method for pretraining a recurrent mixture density network (RMDN). We also propose a slight modification to the architecture of the RMDN-GARCH proposed by Nikolaev et al. [2012]. The pretraining method helps the RMDN avoid bad local minima during training and improves its robustness to the persistent NaN problem, as defined by Guillaumes [2017], which is often encountered with mixture density networks. This problem consists of frequently obtaining "Not a number" (NaN) values during training. The proposed pretraining method resolves these issues by training the linear nodes in the hidden layer of the RMDN before starting to update the non-linear nodes. This approach improves the performance of the RMDN and ensures that it surpasses the GARCH model, which is the RMDN's linear counterpart.
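The two-phase schedule described in the abstract can be sketched on a toy problem. This is not the paper's RMDN: the model below is a hypothetical scalar network with one linear node (weight `a`) and one tanh node (output weight `c`, input weight `w`), and a squared-error loss stands in for the mixture negative log-likelihood. Starting the non-linear output weight at zero, the learning rate, and the step counts are all assumptions made for illustration.

```python
import math

def predict(params, x):
    """Toy model: one linear node plus one tanh node."""
    a, c, w = params
    return a * x + c * math.tanh(w * x)

def loss(params, xs, ys):
    """Half mean squared error (stand-in for a mixture likelihood)."""
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

def gradient(params, xs, ys):
    """Analytic gradient of the loss with respect to (a, c, w)."""
    a, c, w = params
    ga = gc = gw = 0.0
    for x, y in zip(xs, ys):
        h = math.tanh(w * x)
        err = a * x + c * h - y
        ga += err * x
        gc += err * h
        gw += err * c * (1.0 - h * h) * x
    n = len(xs)
    return [ga / n, gc / n, gw / n]

def train(params, xs, ys, trainable, steps, lr=0.1):
    """Gradient descent that only updates parameters flagged as trainable."""
    params = list(params)
    for _ in range(steps):
        grads = gradient(params, xs, ys)
        for i, g in enumerate(grads):
            if trainable[i]:
                params[i] -= lr * g
    return params

# Synthetic data with a linear part and a non-linear part.
xs = [i / 10 for i in range(-10, 11)]
ys = [2.0 * x + 0.5 * math.tanh(3.0 * x) for x in xs]

init = [0.0, 0.0, 1.0]  # non-linear output weight c starts at zero (assumption)

# Phase 1 (pretraining): update only the linear node.
pretrained = train(init, xs, ys, trainable=[True, False, False], steps=200)

# Phase 2: update linear and non-linear nodes together.
final = train(pretrained, xs, ys, trainable=[True, True, True], steps=2000)
```

Because the second phase starts from the fitted linear solution and takes small gradient steps, the toy model's loss should not end up worse than that of the purely linear fit, which mirrors the abstract's claim that the pretrained RMDN surpasses its linear counterpart.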
