LG, AI | Jan 17, 2024

A Characterization Theorem for Equivariant Networks with Point-wise Activations

Marco Pacini, Xiaowen Dong, Bruno Lepri, Gabriele Santin

arXiv:2401.09235v1 | 10.45 citations | h-index: 6 | ICLR

Originality: Incremental advance

AI Analysis

This work addresses the limitation of point-wise activations in equivariant networks for specific symmetries, with implications for designing models in domains like graph networks and computer vision.

The paper presents a theorem characterizing all combinations of finite-dimensional representations, coordinates, and point-wise activations that yield exactly equivariant layers in neural networks. As corollaries, it proves that rotation-equivariant networks with point-wise activations can only be invariant, as is the case for any network equivariant with respect to a connected compact group, and it shows that the feature spaces of disentangled steerable CNNs are trivial representations.

Equivariant neural networks have shown improved performance, expressiveness and sample complexity on symmetrical domains. But for some specific symmetries, representations, and choice of coordinates, the most common point-wise activations, such as ReLU, are not equivariant, hence they cannot be employed in the design of equivariant neural networks. The theorem we present in this paper describes all possible combinations of finite-dimensional representations, choice of coordinates and point-wise activations to obtain an exactly equivariant layer, generalizing and strengthening existing characterizations. Notable cases of practical relevance are discussed as corollaries. Indeed, we prove that rotation-equivariant networks can only be invariant, as it happens for any network which is equivariant with respect to connected compact groups. Then, we discuss implications of our findings when applied to important instances of exactly equivariant networks. First, we completely characterize permutation equivariant networks such as Invariant Graph Networks with point-wise nonlinearities and their geometric counterparts, highlighting a plethora of models whose expressive power and performance are still unknown. Second, we show that feature spaces of disentangled steerable convolutional neural networks are trivial representations.
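The obstruction described in the abstract can be checked numerically on a toy case. The sketch below is an illustration only, not code from the paper: it tests whether ReLU commutes with a coordinate swap (a permutation representation) and with a 45-degree rotation of the plane in standard coordinates. The helper `commutes_with`, the sample size, and the chosen matrices are assumptions made for this example.

```python
import numpy as np


def relu(x):
    """Point-wise ReLU activation."""
    return np.maximum(x, 0.0)


def commutes_with(activation, matrix, samples):
    """Return True if activation(M x) == M activation(x) for every sample x."""
    return all(
        np.allclose(activation(matrix @ x), matrix @ activation(x))
        for x in samples
    )


rng = np.random.default_rng(0)
samples = rng.normal(size=(100, 2))  # random test vectors in the plane

# Permutation representation: swap the two coordinates.
swap = np.array([[0.0, 1.0],
                 [1.0, 0.0]])

# Standard representation of a rotation by 45 degrees.
theta = np.pi / 4
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

relu_permutation_equivariant = commutes_with(relu, swap, samples)
relu_rotation_equivariant = commutes_with(relu, rotation, samples)

print("ReLU commutes with the swap:    ", relu_permutation_equivariant)
print("ReLU commutes with the rotation:", relu_rotation_equivariant)
```

In this toy setting ReLU commutes with the permutation matrix but not with the rotation, which is consistent with the abstract's statement that common point-wise activations fail to be equivariant for some choices of symmetry, representation, and coordinates. A finite sample can only expose a failure of equivariance; it does not prove equivariance in general.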
