NANEJan 26, 2016

A network that learns Strassen multiplication

arXiv:1601.07227v111 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient matrix multiplication for computational mathematics, but it is incremental as it applies a known training rule to a specific context.

The paper tackles the problem of discovering Strassen multiplication rules by training neural networks with limited multipliers to find low-rank decompositions of matrix multiplication tensors, achieving rank 7 for M2 with a few thousand examples and rank 23 for M3 with 10^5 examples.

We study neural networks whose only non-linear components are multipliers, to test a new training rule in a context where the precise representation of data is paramount. These networks are challenged to discover the rules of matrix multiplication, given many examples. By limiting the number of multipliers, the network is forced to discover the Strassen multiplication rules. This is the mathematical equivalent of finding low rank decompositions of the $n\times n$ matrix multiplication tensor, $M_n$. We train these networks with the conservative learning rule, which makes minimal changes to the weights so as to give the correct output for each input at the time the input-output pair is received. Conservative learning needs a few thousand examples to find the rank 7 decomposition of $M_2$, and $10^5$ for the rank 23 decomposition of $M_3$ (the lowest known). High precision is critical, especially for $M_3$, to discriminate between true decompositions and "border approximations".

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes