LG · Mar 20, 2023

Model Stitching: Looking For Functional Similarity Between Representations

Adriano Hernandez, Rumen Dangovski, Peter Y. Lu, Marin Soljacic

MIT

arXiv:2303.11277v212.311 citations · h-index: 12

Originality: Incremental advance

AI Analysis

This work helps researchers understand functional similarity between neural network representations. It is an incremental advance because it builds on prior model stitching methods.

The paper compares neural network representations across different architectures by extending model stitching to layers with different shapes. It finds that, for small ResNets, convolution-based stitching can reach high accuracy even when the stitched layers are far apart, provided the layer comes later in the sender network than in the receiver network.

Model stitching (Lenc & Vedaldi 2015) is a compelling methodology for comparing neural network representations, because it measures the degree to which they can be interchanged. We expand on previous work by Bansal, Nakkiran & Barak, which used model stitching to compare representations of the same shape learned by differently seeded and/or trained neural networks of the same architecture. Our contribution enables us to compare the representations learned by layers with different shapes from neural networks with different architectures. We subsequently reveal unexpected behavior of model stitching. Namely, we find that convolution-based stitching for small ResNets can reach high accuracy when the stitched layer comes later in the first (sender) network than in the second (receiver) network, even if those layers are far apart.
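To make the idea of stitching layers with different shapes concrete, here is a minimal pure-Python sketch of what a stitch could look like. It is an illustration under stated assumptions, not the authors' implementation: the abstract says only that the stitch is based on convolutions, so the nearest-neighbour resize, the 1x1 convolution, and the names `resize_nearest`, `conv1x1`, and `stitch` are all hypothetical. In an actual stitching experiment the stitch weights would be trained while the sender and receiver networks stay frozen; here they are fixed by hand to keep the example self-contained.

```python
from typing import List

# An activation laid out as [channel][row][col].
Tensor3 = List[List[List[float]]]


def resize_nearest(x: Tensor3, out_h: int, out_w: int) -> Tensor3:
    """Nearest-neighbour resize of every channel to (out_h, out_w)."""
    in_h, in_w = len(x[0]), len(x[0][0])
    return [
        [
            [chan[(i * in_h) // out_h][(j * in_w) // out_w] for j in range(out_w)]
            for i in range(out_h)
        ]
        for chan in x
    ]


def conv1x1(x: Tensor3, weight: List[List[float]], bias: List[float]) -> Tensor3:
    """1x1 convolution: mixes channels independently at each spatial position.

    `weight` has shape [out_channels][in_channels].
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [
                bias[o] + sum(weight[o][c] * x[c][i][j] for c in range(len(x)))
                for j in range(w)
            ]
            for i in range(h)
        ]
        for o in range(len(weight))
    ]


def stitch(x: Tensor3, weight: List[List[float]], bias: List[float],
           out_h: int, out_w: int) -> Tensor3:
    """Hypothetical stitch: match the spatial size, then match the channel count."""
    return conv1x1(resize_nearest(x, out_h, out_w), weight, bias)


# Sender activation: 2 channels of 4x4 (channel 0 is all 1.0, channel 1 is all 2.0).
sender_out = [[[float(c + 1)] * 4 for _ in range(4)] for c in range(2)]

# The receiver layer is assumed to expect 3 channels of 2x2.
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hand-picked, not trained
bias = [0.0, 0.0, 0.5]

receiver_in = stitch(sender_out, weight, bias, out_h=2, out_w=2)
print(len(receiver_in), len(receiver_in[0]), len(receiver_in[0][0]))
```

The stitched activation has the shape the receiver expects, so the remaining receiver layers could consume it directly. The accuracy of that combined network on the original task is the quantity model stitching uses to judge how interchangeable the two representations are.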
