LG CV ML
Mar 28, 2018

Intertwiners between Induced Representations (with Applications to the Theory of Equivariant Neural Networks)

Taco S. Cohen, Mario Geiger, Maurice Weiler

arXiv:1803.10743v2 14.749 citations

Originality: Highly original

AI Analysis

The paper provides a foundational mathematical framework for equivariant neural networks, with implications for machine learning on data with symmetries, such as images and video.

The paper addresses the design of group equivariant convolutional neural networks (G-CNNs) for data with symmetries. It shows that the layers of an equivariant network are convolutional if and only if the input and output feature spaces transform according to induced representations, which establishes G-CNNs as a universal class of equivariant architectures.

Group equivariant and steerable convolutional neural networks (regular and steerable G-CNNs) have recently emerged as a very effective model class for learning from signal data such as 2D and 3D images, video, and other data where symmetries are present. In geometrical terms, regular G-CNNs represent data in terms of scalar fields ("feature channels"), whereas steerable G-CNNs can also use vector or tensor fields ("capsules") to represent data. In algebraic terms, the feature spaces in regular G-CNNs transform according to a regular representation of the group G, whereas the feature spaces in steerable G-CNNs transform according to the more general induced representations of G. In order to make the network equivariant, each layer in a G-CNN is required to intertwine between the induced representations associated with its input and output space. In this paper, we present a general mathematical framework for G-CNNs on homogeneous spaces like Euclidean space or the sphere. We show, using elementary methods, that the layers of an equivariant network are convolutional if and only if the input and output feature spaces transform according to an induced representation. This result, which follows from G.W. Mackey's abstract theory on induced representations, establishes G-CNNs as a universal class of equivariant network architectures, and generalizes the important recent work of Kondor & Trivedi on the intertwiners between regular representations.
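
The intertwiner condition described in the abstract can be checked numerically in the simplest possible setting. The sketch below is not the paper's construction: it assumes the cyclic group Z_n acting on itself by shifts, so the feature space carries the regular representation (the special case treated by Kondor & Trivedi), and all function names (`shift_matrix`, `circulant`, `is_intertwiner`, `symmetrize`) are hypothetical. It illustrates that a circular convolution commutes with every shift, that a generic linear map does not, and that averaging a generic map over the group yields a convolution.

```python
import numpy as np

def shift_matrix(n, g):
    """Regular representation of Z_n: matrix P with (P @ x)[i] == x[(i - g) % n]."""
    return np.roll(np.eye(n), g, axis=0)

def circulant(kernel):
    """Matrix of circular convolution: C[i, j] == kernel[(i - j) % n]."""
    n = len(kernel)
    return np.stack([np.roll(kernel, j) for j in range(n)], axis=1)

def is_intertwiner(A, n):
    """True if A commutes with every shift, i.e. A is equivariant under Z_n."""
    return all(
        np.allclose(A @ shift_matrix(n, g), shift_matrix(n, g) @ A)
        for g in range(n)
    )

def symmetrize(A, n):
    """Average P_g^{-1} A P_g over the group to project A onto equivariant maps."""
    shifts = [shift_matrix(n, g) for g in range(n)]
    # Each shift is a permutation matrix, so its inverse is its transpose.
    return sum(P.T @ A @ P for P in shifts) / n

n = 8
rng = np.random.default_rng(0)

conv = circulant(rng.normal(size=n))     # a convolution layer (no bias, no nonlinearity)
dense = rng.normal(size=(n, n))          # an unconstrained linear layer
projected = symmetrize(dense, n)         # equivariant part of the dense layer

print(is_intertwiner(conv, n))           # True: convolutions are equivariant
print(is_intertwiner(dense, n))          # False: a generic linear map is not
print(is_intertwiner(projected, n))      # True: group averaging enforces equivariance
# The equivariant map is fully determined by its first column, i.e. it is a convolution.
print(np.allclose(projected, circulant(projected[:, 0])))  # True
```

The last check is the finite, regular-representation analogue of the "if and only if" statement: once equivariance is imposed, the linear map has only n free parameters and coincides with a circular convolution by a single kernel.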

