Aleksandr Beknazaryan

ML
h-index: 4
6 papers
16 citations
Novelty: 37%
AI Score: 25

6 Papers

ML, Jul 17, 2022
Nonparametric regression with modified ReLU networks

Aleksandr Beknazaryan, Hailin Sang

We consider regression estimation with modified ReLU neural networks in which network weight matrices are first modified by a function $α$ before being multiplied by input vectors. We give an example of a continuous, piecewise linear function $α$ for which the empirical risk minimizers over the classes of modified ReLU networks with $l_1$ and squared $l_2$ penalties attain, up to a logarithmic factor, the minimax rate of prediction of an unknown $β$-smooth function.
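As an illustration of the construction described in this abstract, the following minimal sketch applies a function $α$ entrywise to each weight matrix before the matrix-vector product. The clipping function `alpha_clip`, the absence of bias terms and the toy weights are all assumptions made for this example; the abstract does not specify the paper's actual $α$.

```python
def relu(v):
    """Apply ReLU entrywise to a list of floats."""
    return [max(0.0, x) for x in v]


def alpha_clip(w):
    """Hypothetical stand-in for alpha: a continuous, piecewise linear
    clip to [-1, 1]. This is NOT the function constructed in the paper."""
    return max(-1.0, min(1.0, w))


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def modified_relu_forward(weights, x, alpha=alpha_clip):
    """Forward pass in which every weight is passed through alpha
    before it is multiplied by the layer input (no biases, by assumption)."""
    h = list(x)
    for i, W in enumerate(weights):
        W_mod = [[alpha(w) for w in row] for row in W]
        h = matvec(W_mod, h)
        if i < len(weights) - 1:  # no activation after the output layer
            h = relu(h)
    return h


# Toy weights chosen so that some entries fall outside [-1, 1].
weights = [
    [[2.0, -0.5], [0.25, 3.0]],  # hidden layer: 2 inputs -> 2 units
    [[1.0, -4.0]],               # output layer: 2 units -> 1 output
]
out = modified_relu_forward(weights, [1.0, 1.0])
print(out)
```

With this choice of `alpha`, raw weights of any size produce effective weights in $[-1,1]$, which is one way such a modification could keep the network class bounded while the optimization runs over unconstrained matrices.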

ML, Aug 17, 2022
Shallow neural network representation of polynomials

Aleksandr Beknazaryan

We show that $d$-variate polynomials of degree $R$ can be represented on $[0,1]^d$ as shallow neural networks of width $2(R+d)^d$. Also, by a shallow neural network (SNN) representation of localized Taylor polynomials of univariate $C^β$-smooth functions, we derive for shallow networks the minimax optimal rate of convergence, up to a logarithmic factor, to an unknown univariate regression function.

ML, Apr 27, 2025
The Double Descent Behavior in Two Layer Neural Network for Binary Classification

Chathurika S Abeykoon, Aleksandr Beknazaryan, Hailin Sang

Recent studies have observed a surprising behavior of model test error called the double descent phenomenon: as model complexity increases, the test error first decreases, then increases, and then decreases again. To study this, we consider a two-layer neural network with a ReLU activation function designed for binary classification under supervised learning. Our aim is to observe and investigate the mathematical theory behind the double descent behavior of the test error for varying model sizes. We quantify the model size by the ratio of the number of training samples to the dimension of the model. Due to the complexity of the empirical risk minimization procedure, we use the Convex Gaussian Min-Max Theorem to find a suitable candidate for the global training loss.

ML, May 20, 2021
Neural networks with superexpressive activations and integer weights

Aleksandr Beknazaryan

An example of an activation function $σ$ is given such that networks with activations $\{σ, \lfloor\cdot\rfloor\}$, integer weights and a fixed architecture depending on $d$ approximate continuous functions on $[0,1]^d$. The range of integer weights required for $\varepsilon$-approximation of Hölder continuous functions is derived, which leads to a convergence rate of order $n^{-\frac{2β}{2β+d}}\log_2 n$ for neural network regression estimation of an unknown $β$-Hölder continuous function from $n$ given samples.

ML, Apr 5, 2021
Analytic function approximation by path norm regularized deep networks

Aleksandr Beknazaryan

We show that neural networks with the absolute value activation function, whose path norm, depth, width and network weights all depend logarithmically on $1/\varepsilon$, can $\varepsilon$-approximate functions that are analytic on certain regions of $\mathbb{C}^d$.
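For readers unfamiliar with the path norm mentioned in this abstract, the sketch below computes one common version of it: the sum, over all input-to-output paths, of the product of the absolute values of the weights along the path. The bias-free layout and the toy weights are assumptions for illustration, and the paper's exact definition may differ.

```python
def path_norm(weights):
    """Sum over all input-to-output paths of the product of |weights|.

    `weights` is a list of matrices (lists of rows), one per layer, with
    no bias terms (an assumption of this sketch). The sum is accumulated
    layer by layer, which equals 1^T |W_L| ... |W_1| 1.
    """
    # v[j] = total weight of all paths from the inputs to unit j.
    v = [1.0] * len(weights[0][0])
    for W in weights:
        v = [sum(abs(w) * vi for w, vi in zip(row, v)) for row in W]
    return sum(v)


def abs_network(weights, x):
    """Forward pass with the absolute value activation on hidden layers."""
    h = list(x)
    for i, W in enumerate(weights):
        h = [sum(w * hi for w, hi in zip(row, h)) for row in W]
        if i < len(weights) - 1:
            h = [abs(t) for t in h]
    return h


weights = [
    [[1.0, -2.0], [0.5, 0.0]],  # hidden layer: 2 inputs -> 2 units
    [[-1.0, 3.0]],              # output layer: 2 units -> 1 output
]
pn = path_norm(weights)
y = abs_network(weights, [1.0, 1.0])
print(pn, y)
```

Enumerating the four paths by hand gives $1\cdot 1 + 2\cdot 1 + 0.5\cdot 3 + 0\cdot 3 = 4.5$, which matches the layer-by-layer computation.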

ML, Mar 15, 2021
Function approximation by deep neural networks with parameters $\{0,\pm \frac{1}{2}, \pm 1, 2\}$

Aleksandr Beknazaryan

In this paper it is shown that $C_β$-smooth functions can be approximated by deep neural networks with ReLU activation function and with parameters $\{0,\pm \frac{1}{2}, \pm 1, 2\}$. The $l_0$ and $l_1$ parameter norms of the considered networks are thus equivalent. The depth, width and number of active parameters of the constructed networks have, up to a logarithmic factor, the same dependence on the approximation error as the networks with parameters in $[-1,1]$. In particular, this means that nonparametric regression estimation with the constructed networks attains the same convergence rate as with sparse networks with parameters in $[-1,1]$.
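The equivalence of the $l_0$ and $l_1$ norms stated in this abstract follows because every nonzero parameter has absolute value between $\frac{1}{2}$ and $2$, so $\frac{1}{2}\|θ\|_0 \le \|θ\|_1 \le 2\|θ\|_0$ for any parameter vector $θ$. A small check of this inequality, on a made-up parameter vector that is an assumption of this example, might look like:

```python
# Parameter values permitted in the abstract: {0, ±1/2, ±1, 2}.
ALLOWED = {0.0, 0.5, -0.5, 1.0, -1.0, 2.0}


def l0_norm(params):
    """Number of nonzero (active) parameters."""
    return sum(1 for p in params if p != 0)


def l1_norm(params):
    """Sum of absolute values of the parameters."""
    return sum(abs(p) for p in params)


def norms_equivalent(params):
    """Check (1/2) * l0 <= l1 <= 2 * l0 for parameters in ALLOWED."""
    if not all(p in ALLOWED for p in params):
        raise ValueError("parameter outside {0, ±1/2, ±1, 2}")
    l0, l1 = l0_norm(params), l1_norm(params)
    return 0.5 * l0 <= l1 <= 2 * l0


# Hypothetical parameter vector, for illustration only.
params = [0.0, 0.5, -1.0, 2.0, -0.5, 0.0, 1.0]
print(l0_norm(params), l1_norm(params), norms_equivalent(params))
```

The same bound fails for networks with parameters in $[-1,1]$, where nonzero weights can be arbitrarily small and the $l_1$ norm no longer controls the number of active parameters.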