LG · AI · Jun 13, 2024

Separation Power of Equivariant Neural Networks

arXiv:2406.08966v2 · 1 citation
Originality: Highly original
AI Analysis

This paper provides theoretical insights into the expressivity of equivariant networks, of interest to researchers in machine learning theory.

The paper analyzes the separation power of equivariant neural networks, showing that all non-polynomial activations achieve maximum separation power, depth improves separation power up to a threshold, and block decomposition of hidden representations creates a hierarchy in separation power.

The separation power of a machine learning model refers to its ability to distinguish between different inputs and is often used as a proxy for its expressivity. Indeed, knowing the separation power of a family of models is a necessary condition to obtain fine-grained universality results. In this paper, we analyze the separation power of equivariant neural networks, such as convolutional and permutation-invariant networks. We first present a complete characterization of inputs indistinguishable by models derived from a given architecture. From this result, we derive how separability is influenced by hyperparameters and architectural choices such as activation functions, depth, hidden layer width, and representation types. Notably, all non-polynomial activations, including ReLU and sigmoid, are equivalent in expressivity and reach maximum separation power. Depth improves separation power up to a threshold, after which further increases have no effect. Adding invariant features to hidden representations does not impact separation power. Finally, block decomposition of hidden representations affects separability, with minimal components forming a hierarchy in separation power that provides a straightforward method for comparing the separation power of models.
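
As a rough illustration of what "separation power" means here, the sketch below compares a toy permutation-invariant model under two activations. It is not the paper's construction: the model `sum_i activation(w * x_i + b)`, the helper names `invariant_model` and `separates`, the parameter grid, and the two test inputs are all assumptions chosen for this example. With the identity activation the model only sees the sum of the entries, so two inputs with equal sum and length are indistinguishable; with ReLU some parameter choice tells them apart.

```python
import numpy as np

def invariant_model(x, w, b, activation):
    """Toy permutation-invariant model: sum_i activation(w * x_i + b)."""
    return float(np.sum(activation(w * x + b)))

def separates(x, y, activation, weights, biases, tol=1e-9):
    """True if some (w, b) on the grid gives different outputs on x and y."""
    return any(
        abs(invariant_model(x, w, b, activation)
            - invariant_model(y, w, b, activation)) > tol
        for w in weights
        for b in biases
    )

def identity(z):
    # Polynomial (degree-1) activation: the model collapses to w * sum(x) + n * b.
    return z

def relu(z):
    # Non-polynomial activation.
    return np.maximum(z, 0.0)

# Two inputs with the same length and the same sum (hypothetical test data).
x = np.array([1.0, -1.0])
y = np.array([0.0, 0.0])
grid = np.linspace(-2.0, 2.0, 9)  # small parameter grid, assumed for the sketch

identity_separates = separates(x, y, identity, grid, grid)
relu_separates = separates(x, y, relu, grid, grid)
# A permutation of x must never be separated from x by an invariant model.
permuted_separates = separates(x, x[::-1], relu, grid, grid)

print("identity separates x, y:", identity_separates)
print("ReLU separates x, y:", relu_separates)
print("ReLU separates x, permuted x:", permuted_separates)
```

In this toy setting the identity model cannot distinguish `x` from `y`, the ReLU model can (for example at `w = 1`, `b = 0`), and neither distinguishes `x` from its permutation. This only illustrates the notion of indistinguishable inputs and says nothing about the threshold depth or the block-decomposition hierarchy discussed in the abstract.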
