Steffen Goebbels

LG · Aug 19, 2020

ReLU activated Multi-Layer Neural Networks trained with Mixed Integer Linear Programs

Steffen Goebbels

In this paper, it is demonstrated through a case study that multilayer feedforward neural networks activated by ReLU functions can in principle be trained iteratively with Mixed Integer Linear Programs (MILPs) as follows. Weights are determined with batch learning, using multiple iterations per batch of training data. In each iteration, the algorithm starts at the output layer and propagates information back to the first hidden layer to adjust the weights using MILPs or Linear Programs. For each layer, the goal is to minimize the difference between its output and the corresponding target output. The target output of the last (output) layer is equal to the ground truth. The target output of a previous layer is defined as the adjusted input of the following layer. For a given layer, weights are computed by solving a MILP. Then, except for the first hidden layer, the input values are also modified with a MILP to better match the layer outputs to their corresponding target outputs. The method was tested and compared with TensorFlow/Keras (Adam optimizer) using two simple networks on the MNIST dataset of handwritten digits. Accuracies of the same magnitude as those obtained with TensorFlow/Keras were achieved.
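The abstract does not say how the ReLU non-linearity enters the MILPs. As an illustration only, the following sketch checks a common "big-M" encoding of y = max(0, a) with one binary variable z: y >= a, y >= 0, y <= a + M(1 - z), y <= M z. The encoding, the function names and the bound `big_m` are assumptions made for this example and are not taken from the paper. Because the pre-activation a is linear in the weights when the layer inputs are fixed (and linear in the inputs when the weights are fixed), constraints of this kind remain linear in either of the two alternating steps described above.

```python
def bigm_relu_interval(a, z, big_m):
    """Feasible interval for y under the big-M constraints, or None.

    Constraints (assumed encoding, valid when |a| <= big_m):
        y >= a,  y >= 0,  y <= a + big_m * (1 - z),  y <= big_m * z
    """
    lo = max(a, 0.0)
    hi = min(a + big_m * (1 - z), big_m * z)
    return (lo, hi) if lo <= hi else None


def feasible_outputs(a, big_m=10.0):
    """Collect the feasible y intervals over both values of the binary z."""
    intervals = []
    for z in (0, 1):
        interval = bigm_relu_interval(a, z, big_m)
        if interval is not None:
            intervals.append(interval)
    return intervals


if __name__ == "__main__":
    for a in (-3.0, 0.0, 2.0):
        print(a, feasible_outputs(a))
```

For every pre-activation within the bound, each feasible interval collapses to the single point max(0, a), so a solver that satisfies these linear constraints reproduces the ReLU exactly.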

FA · Apr 5, 2020

On Sharpness of Error Bounds for Multivariate Neural Network Approximation

Steffen Goebbels

Single hidden layer feedforward neural networks can represent multivariate functions that are sums of ridge functions. These ridge functions are defined via an activation function and customizable weights. The paper deals with best non-linear approximation by such sums of ridge functions. Error bounds are presented in terms of moduli of smoothness. The main focus, however, is to prove that the bounds are best possible. To this end, counterexamples are constructed with a non-linear, quantitative extension of the uniform boundedness principle. They show sharpness with respect to Lipschitz classes for the logistic activation function and for certain piecewise polynomial activation functions. The paper is based on univariate results in (Goebbels, St.: On sharpness of error bounds for univariate approximation by single hidden layer feedforward neural networks. Results Math 75 (3), 2020, article 109, https://rdcu.be/b5mKH).
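To make the notion of a sum of ridge functions concrete, here is a minimal sketch that evaluates a single hidden layer network of the form f(x) = sum over k of a_k * sigma(w_k · x + c_k) with the logistic activation. The parameter names (`directions`, `biases`, `coeffs`) and the presence of a bias term c_k are assumptions for illustration; the abstract does not fix the notation.

```python
import math


def logistic(t):
    """Logistic activation function sigma(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def ridge_sum(x, directions, biases, coeffs, activation=logistic):
    """Evaluate sum_k coeffs[k] * activation(directions[k] . x + biases[k])."""
    total = 0.0
    for w, c, a in zip(directions, biases, coeffs):
        inner = sum(wi * xi for wi, xi in zip(w, x)) + c
        total += a * activation(inner)
    return total


if __name__ == "__main__":
    directions = [(1.0, 2.0)]
    biases = [0.0]
    coeffs = [2.0]
    # Each summand depends on x only through the scalar w . x, so it is
    # constant along directions orthogonal to w.
    print(ridge_sum((0.0, 0.0), directions, biases, coeffs))
    print(ridge_sum((2.0, -1.0), directions, biases, coeffs))
```

Both calls evaluate the same ridge function at points with w · x = 0, so they return the same value. This is the defining property of a ridge function: it varies only along its direction vector.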

2 Papers