Unifying and Merging Well-trained Deep Neural Networks for Inference Stage
This work addresses the need for efficient model deployment on resource-limited devices, though it appears incremental because it builds on existing merging techniques.
The paper addresses merging two well-trained convolutional neural networks, possibly with different architectures, into a single unified model for efficient inference on resource-limited devices. The result is a compact model, and reusing the already-trained weights reduces training overhead and shortens development time.
We propose a novel method for merging convolutional neural networks for the inference stage. Given two well-trained networks that may have different architectures and handle different tasks, our method aligns the layers of the original networks and merges them into a unified model by sharing the representative codes of their weights. The shared weights are then re-trained to fine-tune the performance of the merged model. The proposed method produces a compact model that can run the original tasks simultaneously on resource-limited devices. Because it preserves the general architectures and reuses the co-used weights of the well-trained networks, it substantially reduces training overhead and shortens system development time. Experimental results show satisfactory performance and validate the effectiveness of the method.
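To make the idea of "sharing the representative codes of weights" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the weights of two already-aligned layers are split into short sub-vectors and clustered jointly with plain k-means, so that both networks store only indices into one shared codebook. The function names (`kmeans`, `merge_layers`, `decode`), the segment length `seg`, and the codebook size `k` are all hypothetical choices made for illustration; the fine-tuning step described in the abstract is omitted.

```python
import numpy as np


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (codebook, assignments)."""
    rng = np.random.default_rng(seed)
    # Initialise the codebook with k distinct points drawn from the data.
    codebook = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        # Squared distance from every point to every codeword.
        dist = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            members = points[assign == j]
            if len(members) > 0:  # leave empty clusters unchanged
                codebook[j] = members.mean(axis=0)
    # Final assignment against the updated codebook.
    dist = ((points[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return codebook, dist.argmin(axis=1)


def merge_layers(w_a, w_b, k=16, seg=4):
    """Encode two aligned layers with one shared codebook (assumed scheme).

    Both weight arrays must have a total size divisible by `seg`.
    """
    seg_a = w_a.reshape(-1, seg)  # sub-vectors from network A
    seg_b = w_b.reshape(-1, seg)  # sub-vectors from network B
    codebook, assign = kmeans(np.concatenate([seg_a, seg_b]), k)
    # Each network keeps only its own indices into the shared codebook.
    return codebook, assign[: len(seg_a)], assign[len(seg_a):]


def decode(codebook, idx, shape):
    """Rebuild an approximate weight array from codebook indices."""
    return codebook[idx].reshape(shape)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    w_a = rng.normal(size=(8, 16))  # stand-in for a layer of network A
    w_b = rng.normal(size=(8, 16))  # stand-in for the aligned layer of B
    codebook, idx_a, idx_b = merge_layers(w_a, w_b)
    w_a_hat = decode(codebook, idx_a, w_a.shape)
    print("mean squared reconstruction error (A):", ((w_a - w_a_hat) ** 2).mean())
```

In this sketch the merged layer stores one `k × seg` codebook plus two index arrays instead of two full weight arrays, which is where the compactness would come from. The quantization is lossy, which is consistent with the abstract's statement that the shared weights are re-trained afterwards to recover performance.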