LG AI CT RA ML | Feb 23, 2024

Position: Categorical Deep Learning is an Algebraic Theory of All Architectures

Bruno Gavranović, Paul Lessard, Andrew Dudzik, Tamara von Glehn, João G. M. Araújo, Petar Veličković

arXiv:2402.15332v2 | 25.432 citations | h-index: 11 | ICML

Originality: Synthesis-oriented

AI Analysis

This position paper offers a theoretical foundation for unifying diverse neural network designs, though its contribution is incremental in that it applies existing mathematical tools.

The authors tackle the lack of a general-purpose framework for deep learning architectures by proposing category theory as a unified approach that bridges model constraints and implementations, showing it recovers geometric deep learning constraints and implementations of architectures like RNNs.

We present our position on the elusive quest for a general-purpose framework for specifying and studying deep learning architectures. Our opinion is that the key attempts made so far lack a coherent bridge between specifying constraints which models must satisfy and specifying their implementations. Focusing on building such a bridge, we propose to apply category theory -- precisely, the universal algebra of monads valued in a 2-category of parametric maps -- as a single theory elegantly subsuming both of these flavours of neural network design. To defend our position, we show how this theory recovers constraints induced by geometric deep learning, as well as implementations of many architectures drawn from the diverse landscape of neural networks, such as RNNs. We also illustrate how the theory naturally encodes many standard constructs in computer science and automata theory.
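
As a rough illustration of what "parametric maps" means here, the following Python sketch treats a layer as a function f(p, x) of a parameter p and an input x, and composes two such layers by pairing their parameters. It is a minimal sketch under stated assumptions, not the paper's construction: the names `Para`, `then`, and `unroll` are hypothetical, and it leaves out the 2-categorical structure (reparameterisations) and the monad algebras that the abstract refers to. The `unroll` helper only illustrates weight tying, where one parameter is reused at every step, as in an RNN.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class Para:
    """Hypothetical wrapper for a parametric map f(p, x) -> y."""
    apply: Callable[[Any, Any], Any]

    def then(self, other: "Para") -> "Para":
        # The composite is parameterised by a pair (p, q):
        # run `self` with p, then feed the result to `other` with q.
        return Para(lambda pq, x: other.apply(pq[1], self.apply(pq[0], x)))


def unroll(cell: Para, p: Any, state: Any, inputs: Iterable[Any]) -> Any:
    """Weight tying: apply the same cell with the same parameter p at each step."""
    for x in inputs:
        state = cell.apply(p, (state, x))
    return state


# Two toy scalar "layers" (illustrative assumptions, not from the paper).
scale = Para(lambda w, x: w * x)
shift = Para(lambda b, x: x + b)
affine = scale.then(shift)  # parameter is the pair (w, b)

# A toy recurrent cell: new_state = w * state + x.
cell = Para(lambda w, sx: w * sx[0] + sx[1])

print(affine.apply((2, 3), 5))
print(unroll(cell, 2, 0, [1, 2, 3]))
```

Pairing the parameters in `then` keeps each layer's weights separate in the composite, whereas `unroll` deliberately shares one parameter across all steps. That contrast is the distinction the sketch is meant to show.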
