CTLGAug 1, 2022

Neural network layers as parametric spans

arXiv:2208.00809v23 citationsh-index: 11
Originality Highly original
AI Analysis

This work provides a foundational mathematical framework for neural network layers, potentially benefiting researchers in machine learning theory by offering a unified approach to layer definition and differentiation.

The authors tackled the challenge of mathematically defining increasingly complex neural network layers by introducing a general definition of linear layers based on categorical frameworks, integration theory, and parametric spans, which generalizes classical layers and ensures derivative computability for backpropagation.

Properties such as composability and automatic differentiation made artificial neural networks a pervasive tool in applications. Tackling more challenging problems caused neural networks to progressively become more complex and thus difficult to define from a mathematical perspective. We present a general definition of linear layer arising from a categorical framework based on the notions of integration theory and parametric spans. This definition generalizes and encompasses classical layers (e.g., dense, convolutional), while guaranteeing existence and computability of the layer's derivatives for backpropagation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes