On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks
This work addresses a limitation of layer-wise model fusion when the input networks are heterogeneous, offering incremental improvements in efficiency and performance for tasks such as model compression and knowledge distillation.
The paper tackles the problem of fusing neural networks with different numbers of layers, proposing CLAFusion to enable cross-layer alignment and improve accuracy on datasets like CIFAR10, CIFAR100, and Tiny-ImageNet.
Layer-wise model fusion via optimal transport, named OTFusion, applies soft neuron association to unify different pre-trained networks and save computational resources. Despite its success, OTFusion requires the input networks to have the same number of layers. To address this issue, we propose a novel model fusion framework, named CLAFusion, that uses cross-layer alignment to fuse neural networks with different numbers of layers, which we refer to as heterogeneous neural networks. The cross-layer alignment problem, which is an unbalanced assignment problem, can be solved efficiently using dynamic programming. Based on the cross-layer alignment, our framework balances the number of layers of the neural networks before applying layer-wise model fusion. Our experiments indicate that CLAFusion, with an extra finetuning process, improves the accuracy of residual networks on the CIFAR10, CIFAR100, and Tiny-ImageNet datasets. Furthermore, we explore its practical usage for model compression and knowledge distillation when applied to the teacher-student setting.
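
To make the dynamic-programming claim concrete, the following is a minimal sketch of one way an unbalanced, order-preserving assignment between layers could be solved. It is not the authors' implementation. It assumes that each layer of the shallower network is matched to a distinct layer of the deeper network, that the matching preserves layer order, and that a precomputed cost matrix of layer dissimilarities is available. The function name `align_layers` and the example costs are hypothetical.

```python
import math


def align_layers(cost):
    """Order-preserving minimum-cost assignment (illustrative sketch).

    cost[i][j] is an assumed dissimilarity between layer i of the
    shallower network (m layers) and layer j of the deeper one (n layers).
    Returns (total_cost, mapping), where mapping[i] is the index of the
    deeper-network layer matched to shallow layer i, strictly increasing.
    """
    m = len(cost)
    n = len(cost[0]) if m else 0
    if m > n:
        raise ValueError("the first network must not be deeper than the second")

    # dp[i][j]: best cost of matching the first i shallow layers
    # using only the first j deep layers.
    dp = [[math.inf] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        dp[0][j] = 0.0  # nothing left to match costs nothing

    for i in range(1, m + 1):
        for j in range(i, n + 1):
            skip = dp[i][j - 1]  # leave deep layer j-1 unmatched
            match = dp[i - 1][j - 1] + cost[i - 1][j - 1]  # pair i-1 with j-1
            dp[i][j] = min(skip, match)

    # Backtrack to recover which deep layer each shallow layer was paired with.
    mapping = [0] * m
    i, j = m, n
    while i > 0:
        if dp[i][j] == dp[i - 1][j - 1] + cost[i - 1][j - 1]:
            mapping[i - 1] = j - 1
            i -= 1
        j -= 1
    return dp[m][n], mapping


# Hypothetical costs: 2 shallow layers against 3 deep layers.
example_cost = [
    [1, 5, 9],
    [9, 9, 1],
]
total, mapping = align_layers(example_cost)
print(total, mapping)
```

The table has (m + 1)(n + 1) entries and each is filled in constant time, so the sketch runs in O(mn) time. How the layer dissimilarities are computed, and whether additional constraints such as pinning particular layers apply, is left open here because the abstract does not specify them.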