NALGFeb 23, 2023

Backpropagation through Back Substitution with a Backslash

AmazonMIT
arXiv:2303.15449v21 citationsh-index: 45
Originality Synthesis-oriented
AI Analysis

This work provides an incremental theoretical reformulation of automatic differentiation for researchers in machine learning and numerical computing.

The paper tackles the problem of formulating backpropagation using linear algebra, enabling gradient calculation via Gaussian elimination on triangular systems with operator-valued matrix elements, and demonstrates its implementation in Julia using generic programming without operator overloading.

We present a linear algebra formulation of backpropagation which allows the calculation of gradients by using a generically written ``backslash'' or Gaussian elimination on triangular systems of equations. Generally, the matrix elements are operators. This paper has three contributions: (i) it is of intellectual value to replace traditional treatments of automatic differentiation with a (left acting) operator theoretic, graph-based approach; (ii) operators can be readily placed in matrices in software in programming languages such as Julia as an implementation option; (iii) we introduce a novel notation, ``transpose dot'' operator ``$\{\}^{T_\bullet}$'' that allows for the reversal of operators. We further demonstrate the elegance of the operators approach in a suitable programming language consisting of generic linear algebra operators such as Julia \cite{bezanson2017julia}, and that it is possible to realize this abstraction in code. Our implementation shows how generic linear algebra can allow operators as elements of matrices. In contrast to ``operator overloading,'' where backslash would normally have to be rewritten to take advantage of operators, with ``generic programming'' there is no such need.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes