CV LG IV · Aug 8, 2020

Using UNet and PSPNet to explore the reusability principle of CNN parameters

arXiv:2008.03414v11.2

Originality: Incremental advance

AI Analysis

This work addresses the problem of reducing training data requirements for deep learning practitioners by clarifying the reasons behind parameter reusability, though it is incremental as it builds on existing transfer and semi-supervised learning methods.

The paper experimentally quantifies the reusability of parameters in deep convolutional neural networks by using UNet and PSPNet for segmentation and auto-encoder tasks, proving that parameters can be reused due to general network features and minimal differences from ideal parameters, with specific observations on sensitivity in BN and convolutional layers.

How to reduce the amount of training data required is a hot topic in the deep learning community. One straightforward way is to reuse pre-trained parameters. Previous work such as deep transfer learning reuses the model parameters trained on a first task as the starting point for a second task, while semi-supervised learning trains on a combination of labeled and unlabeled data. However, the fundamental reason for the success of these methods is unclear. In this paper, the reusability of the parameters in each layer of a deep convolutional neural network is experimentally quantified by using a network to perform segmentation and auto-encoder tasks. This paper shows that network parameters can be reused for two reasons: first, the network features are general; second, there is little difference between the pre-trained parameters and the ideal network parameters. Using parameter replacement and comparison, we demonstrate that reusability differs between the BN (Batch Normalization) [7] layer and the convolution layer, and we make the following observations: (1) The running mean and running variance play a more important role than the weight and bias in the BN layer. (2) The weight and bias can be reused in BN layers. (3) The network is very sensitive to the weights of the convolutional layers. (4) The network is not sensitive to the bias in the convolution layers, so it can be reused directly.
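The parameter replacement idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the paper's actual procedure, so everything here is an assumption for illustration: the two toy parameter dictionaries, the key names (modeled loosely on the `layer.field` naming common in deep learning frameworks), and the helper `replace_params` are all hypothetical. The sketch copies only selected fields of selected layers from one set of trained parameters into another, which is the kind of swap one would then follow with an evaluation of the modified network.

```python
import copy


def replace_params(target, source, layer_tag, fields):
    """Return (swapped, replaced): a copy of `target` in which every entry
    whose dotted name contains `layer_tag` and ends in one of `fields`
    is overwritten with the value from `source`, plus the list of names
    that were overwritten. `target` itself is left untouched."""
    swapped = copy.deepcopy(target)
    replaced = []
    for name in target:
        *prefix, field = name.split(".")
        if layer_tag in prefix and field in fields:
            swapped[name] = copy.deepcopy(source[name])
            replaced.append(name)
    return swapped, replaced


# Hypothetical parameters of the same architecture trained on two tasks.
# Values are made up; real entries would be tensors.
segmentation = {
    "enc1.conv.weight": [0.5, -0.2],
    "enc1.conv.bias": [0.1],
    "enc1.bn.weight": [1.0],
    "enc1.bn.bias": [0.0],
    "enc1.bn.running_mean": [0.3],
    "enc1.bn.running_var": [1.2],
}
autoencoder = {
    "enc1.conv.weight": [0.4, -0.1],
    "enc1.conv.bias": [0.2],
    "enc1.bn.weight": [0.9],
    "enc1.bn.bias": [0.05],
    "enc1.bn.running_mean": [0.7],
    "enc1.bn.running_var": [2.0],
}

# Reuse only the BN weight and bias from the auto-encoder, keeping the
# segmentation network's own running statistics and convolution parameters.
swapped, replaced = replace_params(
    segmentation, autoencoder, layer_tag="bn", fields={"weight", "bias"}
)
print(replaced)
```

Changing `layer_tag` and `fields` (for example `"bn"` with `{"running_mean", "running_var"}`, or `"conv"` with `{"bias"}`) gives the other swaps listed in the observations; comparing task performance before and after each swap is what would indicate how sensitive the network is to that group of parameters.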
