Set Norm and Equivariant Skip Connections: Putting the Deep in Deep Sets
This work addresses vanishing/exploding gradients and ill-suited normalization in deep permutation-invariant models, which matters to machine learning researchers and practitioners working with set-based data. The contribution is incremental in that it builds on the existing Deep Sets and Set Transformer architectures.
The paper tackles vanishing/exploding gradients and performance degradation in deep permutation-invariant neural networks by introducing set norm and equivariant skip connections. The resulting models, Deep Sets++ and Set Transformer++, achieve comparable or better performance than their original counterparts on a diverse suite of tasks.
Permutation invariant neural networks are a promising tool for making predictions from sets. However, we show that existing permutation invariant architectures, Deep Sets and Set Transformer, can suffer from vanishing or exploding gradients when they are deep. Additionally, layer norm, the normalization of choice in Set Transformer, can hurt performance by removing information useful for prediction. To address these issues, we introduce the clean path principle for equivariant residual connections and develop set norm, a normalization tailored for sets. With these, we build Deep Sets++ and Set Transformer++, models that reach high depths with comparable or better performance than their original counterparts on a diverse suite of tasks. We additionally introduce Flow-RBC, a new single-cell dataset and real-world application of permutation invariant prediction. We open-source our data and code here: https://github.com/rajesh-lab/deep_permutation_invariant.
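To make the two ideas in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). It rests on several assumptions: that set norm standardizes each set with a single mean and variance computed over all of its elements and features, followed by a per-feature scale and shift; that a "clean path" residual block leaves the skip path as a pure identity and applies normalization, activation, and a per-element linear map only on the residual branch; and that sum pooling produces the final invariant output. The names `set_norm`, `clean_path_block`, and `predict` are hypothetical.

```python
import numpy as np


def set_norm(x, gamma, beta, eps=1e-5):
    """Normalize one set x of shape (n_elements, n_features).

    Assumption: a single mean and variance are computed over every
    element and feature of the set, then a per-feature affine
    transform (gamma, beta) is applied.
    """
    mu = x.mean()
    var = x.var()
    x_hat = (x - mu) / np.sqrt(var + eps)
    return x_hat * gamma + beta


def clean_path_block(x, weight, gamma, beta):
    """Equivariant residual block with an untouched skip path.

    Assumed ordering on the residual branch: norm -> ReLU -> linear.
    The same weight is applied to every element, so permuting the
    rows of x permutes the rows of the output in the same way.
    """
    h = set_norm(x, gamma, beta)
    h = np.maximum(h, 0.0)   # ReLU
    h = h @ weight           # per-element linear map
    return x + h             # skip path is the identity


def predict(x, blocks):
    """Stack equivariant blocks, then sum-pool for an invariant output."""
    for weight, gamma, beta in blocks:
        x = clean_path_block(x, weight, gamma, beta)
    return x.sum(axis=0)


# Hypothetical usage: a set of 5 elements with 4 features, 3 blocks.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 4))
blocks = [
    (0.1 * rng.normal(size=(4, 4)), np.ones(4), np.zeros(4))
    for _ in range(3)
]
perm = rng.permutation(5)
# Reordering the set elements should not change the prediction,
# up to floating-point rounding.
print(np.allclose(predict(x, blocks), predict(x[perm], blocks)))
```

Because the statistics in this sketch are shared across the whole set, the relative differences between elements and between features within an element survive normalization. Layer norm, by contrast, standardizes each element separately, which is the kind of information loss the abstract refers to.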