Low Dimensional Invariant Embeddings for Universal Geometric Learning
This work provides a more efficient theoretical framework for invariant learning, though it is incremental in that it builds on existing separating-invariant methods.
The paper addresses the problem of constructing separating invariants for geometric learning, showing that $2D+1$ invariants suffice for full separation and $D+1$ for generic separation, fewer than in several earlier proposals. This yields more efficient universal constructions for equivariant neural networks, with applications to group actions such as permutations and rotations.
This paper studies separating invariants: mappings on $D$-dimensional domains which are invariant to an appropriate group action, and which separate orbits. The motivation for this study comes from the usefulness of separating invariants in proving universality of equivariant neural network architectures. We observe that in several cases the cardinality of separating invariants proposed in the machine learning literature is much larger than the dimension $D$. As a result, the theoretical universal constructions based on these separating invariants are unrealistically large. Our goal in this paper is to resolve this issue. We show that when a continuous family of semi-algebraic separating invariants is available, separation can be obtained by randomly selecting $2D+1$ of these invariants. We apply this methodology to obtain an efficient scheme for computing separating invariants for several classical group actions which have been studied in the invariant learning literature. Examples include matrix multiplication actions on point clouds by permutations, rotations, and various other linear groups. Often the requirement of invariant separation is relaxed and only generic separation is required. In this case, we show that only $D+1$ invariants are required. More importantly, generic invariants are often significantly easier to compute, as we illustrate by discussing generic and full separation for weighted graphs. Finally, we outline an approach for proving that separating invariants can also be constructed when the random parameters have finite precision.
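The random-selection idea can be illustrated with a minimal sketch for the permutation action on point clouds. It assumes, for illustration only, a sort-based continuous family $f(X; a, b) = \langle b, \mathrm{sort}(a^\top X) \rangle$ on $X \in \mathbb{R}^{d \times n}$, so that $D = dn$; the abstract does not specify the family. The function name `random_sort_invariants`, the Gaussian sampling of the parameters, and the seed are hypothetical choices, not taken from the paper.

```python
import numpy as np


def random_sort_invariants(d, n, seed=0):
    """Return a map from d x n point clouds to 2*d*n + 1 permutation invariants.

    Assumption (not stated in the abstract): each invariant has the form
    <b, sort(a^T X)> with parameters a in R^d and b in R^n drawn at random.
    """
    rng = np.random.default_rng(seed)
    m = 2 * d * n + 1                  # 2D + 1 invariants, with D = d * n
    A = rng.standard_normal((m, d))    # one projection direction a per invariant
    B = rng.standard_normal((m, n))    # one weight vector b per invariant

    def embed(X):
        X = np.asarray(X, dtype=float)
        # Project the n points onto each direction, then sort each row.
        # Sorting removes any dependence on the order of the columns of X.
        projections = np.sort(A @ X, axis=1)      # shape (m, n)
        # Weighted sum of the sorted projections gives one scalar per invariant.
        return np.sum(B * projections, axis=1)    # shape (m,)

    return embed
```

For example, with `embed = random_sort_invariants(d=3, n=5)`, the vectors `embed(X)` and `embed(X[:, perm])` agree up to floating-point error for any column permutation `perm`, while two independently sampled point clouds are mapped to different vectors. The sketch demonstrates invariance and the $2D+1$ output size only; it does not by itself verify the separation guarantee stated in the abstract.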