CC LG NE Apr 4, 2022

Training Fully Connected Neural Networks is $\exists\mathbb{R}$-Complete

Daniel Bertschinger, Christoph Hertrich, Paul Jungeblut, Tillmann Miltzow, Simon Weber

arXiv:2204.01368v3 14.238 citations h-index: 19

Originality Highly original

AI Analysis

The paper establishes a fundamental computational hardness result for neural network training. For machine learning researchers and practitioners, it shows that exact optimization has inherent limitations.

The paper proves that the decision problem of training a two-layer fully connected neural network to optimality is ∃ℝ-complete, meaning it is polynomial-time equivalent to deciding whether a multivariate polynomial with integer coefficients has a real root. It also shows that some instances require weights that are algebraic numbers of arbitrarily large degree to be trained to optimality, even when all data points are rational.

We consider the problem of finding weights and biases for a two-layer fully connected neural network to fit a given set of data points as well as possible, also known as EmpiricalRiskMinimization. Our main result is that the associated decision problem is $\exists\mathbb{R}$-complete, that is, polynomial-time equivalent to determining whether a multivariate polynomial with integer coefficients has any real roots. Furthermore, we prove that algebraic numbers of arbitrarily large degree are required as weights to be able to train some instances to optimality, even if all data points are rational. Our result already applies to fully connected instances with two inputs, two outputs, and one hidden layer of ReLU neurons. Thereby, we strengthen a result by Abrahamsen, Kleist and Miltzow [NeurIPS 2021]. A consequence of this is that a combinatorial search algorithm like the one by Arora, Basu, Mianjy and Mukherjee [ICLR 2018] is impossible for networks with more than one output dimension, unless $\mathsf{NP}=\exists\mathbb{R}$.
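To make the decision problem concrete, the following is a minimal sketch of its verification side: given candidate weights and biases for a network with one hidden layer of ReLU neurons, compute the total training error and compare it to a target threshold. The squared-error loss, the function names (`forward`, `total_error`, `meets_target`), the threshold name `gamma`, and the toy data are all assumptions made for illustration and are not taken from the paper. The sketch only checks a given candidate; the hardness result concerns deciding whether any suitable weights exist at all. Floating-point weights also cannot represent the high-degree algebraic numbers that the abstract says some optimal solutions require.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Matrix = Sequence[Sequence[float]]
DataPoint = Tuple[Vector, Vector]  # (input, target output)


def relu(z: float) -> float:
    """ReLU activation: max(0, z)."""
    return max(0.0, z)


def forward(x: Vector, W1: Matrix, b1: Vector, W2: Matrix, b2: Vector) -> List[float]:
    """Evaluate a network with one hidden ReLU layer and a linear output layer."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(W1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b for row, b in zip(W2, b2)]


def total_error(data: Sequence[DataPoint], W1: Matrix, b1: Vector,
                W2: Matrix, b2: Vector) -> float:
    """Sum of squared errors over all data points (squared loss is an assumption)."""
    err = 0.0
    for x, y in data:
        out = forward(x, W1, b1, W2, b2)
        err += sum((o - t) ** 2 for o, t in zip(out, y))
    return err


def meets_target(data: Sequence[DataPoint], W1: Matrix, b1: Vector,
                 W2: Matrix, b2: Vector, gamma: float) -> bool:
    """Check whether the candidate weights achieve total error at most gamma."""
    return total_error(data, W1, b1, W2, b2) <= gamma


# Hypothetical toy instance: two inputs, two outputs, two hidden ReLU neurons.
# The targets equal (relu(x1), relu(x2)), so identity weights fit them exactly.
DATA: List[DataPoint] = [
    ((1.0, -2.0), (1.0, 0.0)),
    ((-1.0, 3.0), (0.0, 3.0)),
    ((2.0, 2.0), (2.0, 2.0)),
]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ZERO_BIAS = [0.0, 0.0]

if __name__ == "__main__":
    print(meets_target(DATA, IDENTITY, ZERO_BIAS, IDENTITY, ZERO_BIAS, gamma=0.0))
    worse_W2 = [[1.0, 0.0], [0.0, 0.5]]
    print(total_error(DATA, IDENTITY, ZERO_BIAS, worse_W2, ZERO_BIAS))
```

Checking a candidate is cheap, as the sketch shows. The difficulty captured by the completeness result lies in searching over real-valued weights, where an optimal solution may not be expressible with rational numbers.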
