Using linear initialisation to improve speed of convergence and fully-trained error in Autoencoders
This work addresses training efficiency and performance for neural network practitioners, but its contribution is incremental: it improves an existing component (weight initialisation) rather than presenting a fundamental breakthrough.
The paper tackles the problem of slow convergence and suboptimal final error in autoencoders by introducing a novel weight initialization technique called the Straddled Matrix Initialiser, which outperforms seven other state-of-the-art methods across three datasets.
Good weight initialisation is an important step in the successful training of Artificial Neural Networks, and a number of improvements to this process have been proposed over time. In this paper we introduce a novel weight initialisation technique called the Straddled Matrix Initialiser. The technique is motivated by our assumption that the major, global-scale relationships in data are linear, with only smaller effects requiring complex non-linearities. Combining the Straddled Matrix with the ReLU activation function initialises a Neural Network as a de facto linear model, which we postulate should be a better starting point for optimisation given our assumptions. We test this by training autoencoders on three datasets using the Straddled Matrix and seven other state-of-the-art weight initialisation techniques. In all our experiments the Straddled Matrix Initialiser clearly outperforms all other methods.
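To illustrate why an initialisation can make a ReLU network behave as a de facto linear model, the following is a minimal sketch under stated assumptions. The Straddled Matrix is not defined in this abstract, so the sketch does not implement it. Instead it uses a hypothetical stand-in, `identity_like`, which places ones on the main diagonal of a rectangular matrix. The layer sizes, the zero biases and the non-negative inputs are also assumptions made for illustration. With non-negative inputs and non-negative weights every pre-activation is non-negative, so ReLU acts as the identity and the network reduces to a product of matrices.

```python
import numpy as np

def identity_like(rows, cols):
    # Hypothetical stand-in, NOT the paper's Straddled Matrix:
    # a rectangular matrix with ones on its main diagonal.
    return np.eye(rows, cols)

def relu(x):
    return np.maximum(x, 0.0)

def forward(x, weights):
    # Bias-free feed-forward pass with ReLU after every layer.
    h = x
    for w in weights:
        h = relu(h @ w)
    return h

rng = np.random.default_rng(0)
x = rng.random((5, 8))   # assumed non-negative inputs in [0, 1)
sizes = [8, 4, 8]        # assumed toy autoencoder: 8 -> 4 -> 8
weights = [identity_like(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

nonlinear_out = forward(x, weights)        # network with ReLU
linear_out = x @ weights[0] @ weights[1]   # same weights, no activation

# True when ReLU never clips, i.e. the network is linear at initialisation.
is_linear = np.allclose(nonlinear_out, linear_out)
```

In this toy setting the initial network copies the first four input features and zeroes the rest, so training starts from a linear map rather than from a random non-linear one.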