Relative Representations: Topological and Geometric Perspectives
This work addresses zero-shot model stitching and is aimed at researchers and practitioners in machine learning. Its contribution is incremental: it refines the established relative representations framework rather than introducing a new one.
The paper aims to improve zero-shot model stitching in deep neural networks through two enhancements to relative representations: a normalization procedure that makes the relative transformation invariant to non-isotropic rescalings and permutations, and a topological regularization loss that encourages clustering within classes. Both variations are reported to improve performance on a natural language task, though no specific numbers are provided.
Relative representations are an established approach to zero-shot model stitching, consisting of a non-trainable transformation of the latent space of a deep neural network. Based on insights of a topological and geometric nature, we propose two improvements to relative representations. First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations. These transformations coincide with the symmetries in parameter space induced by common activation functions. Second, when fine-tuning relative representations, we propose to deploy topological densification, a topological regularization loss that encourages clustering within classes. We provide an empirical investigation on a natural language task, where both proposed variations yield improved performance on zero-shot model stitching.
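To make the first proposal concrete, the following is a minimal sketch of a relative transformation with a normalization step placed in front of it. It rests on assumptions that the text above does not spell out: the relative transformation is taken to be cosine similarity between each latent vector and a fixed set of anchor latents, and the normalization is taken to be a per-coordinate standardization using statistics of the anchor set. The function names, the `eps` constant, and the toy data are hypothetical and are not the authors' implementation.

```python
import numpy as np


def standardize(x, stats_from, eps=1e-8):
    """Z-score each latent coordinate using the statistics of `stats_from`."""
    mean = stats_from.mean(axis=0, keepdims=True)
    std = stats_from.std(axis=0, keepdims=True)
    return (x - mean) / (std + eps)


def relative_representation(latents, anchors, normalize=True):
    """Cosine similarity of every latent vector to every anchor vector."""
    if normalize:
        # Assumption: per-coordinate standardization over the anchor set.
        # It cancels any positive per-coordinate (non-isotropic) rescaling.
        latents = standardize(latents, anchors)
        anchors = standardize(anchors, anchors)
    latents = latents / np.linalg.norm(latents, axis=1, keepdims=True)
    anchors = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return latents @ anchors.T


# Toy data standing in for encoder outputs (hypothetical shapes).
rng = np.random.default_rng(0)
latents = rng.normal(size=(5, 8))
anchors = rng.normal(size=(4, 8))

# A non-isotropic rescaling followed by a permutation of the coordinates.
scales = rng.uniform(0.5, 3.0, size=8)
perm = rng.permutation(8)


def transform(x):
    return (x * scales)[:, perm]


rel_a = relative_representation(latents, anchors)
rel_b = relative_representation(transform(latents), transform(anchors))
plain_a = relative_representation(latents, anchors, normalize=False)
plain_b = relative_representation(transform(latents), transform(anchors), normalize=False)

print("normalized version invariant:", np.allclose(rel_a, rel_b))
print("plain version invariant:", np.allclose(plain_a, plain_b))
```

In this sketch, cosine similarity alone already absorbs coordinate permutations and isotropic rescalings, while the standardization step is what removes the dependence on per-coordinate scales. Whether the statistics should come from the anchors, a batch, or the full dataset is a design choice that this example does not settle.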