CO · AI · IT · CV · ML · Jun 11, 2019

Neural network identifiability for a family of sigmoidal nonlinearities

arXiv:1906.06994v3 · 23 citations
Originality: Incremental advance
AI Analysis

The paper addresses a foundational theoretical problem in machine learning, extending identifiability results beyond the single-hidden-layer and fully connected networks covered by earlier work.

The paper studies neural network identifiability: whether the input-output map of a feed-forward network uniquely determines its architecture, weights, and biases. It derives necessary genericity conditions for networks of arbitrary depth and connectivity with an arbitrary nonlinearity, and constructs a family of nonlinearities for which these conditions are both necessary and sufficient. That family can approximate many commonly used nonlinearities to arbitrary precision in the uniform norm.

This paper addresses the following question of neural network identifiability: Does the input-output map realized by a feed-forward neural network with respect to a given nonlinearity uniquely specify the network architecture, weights, and biases? Existing literature on the subject [Sussman 1992, Albertini, Sontag et al. 1993, Fefferman 1994] suggests that the answer should be yes, up to certain symmetries induced by the nonlinearity, and provided the networks under consideration satisfy certain "genericity conditions". The results in [Sussman 1992] and [Albertini, Sontag et al. 1993] apply to networks with a single hidden layer, and in [Fefferman 1994] the networks need to be fully connected. In an effort to answer the identifiability question in greater generality, we derive necessary genericity conditions for the identifiability of neural networks of arbitrary depth and connectivity with an arbitrary nonlinearity. Moreover, we construct a family of nonlinearities for which these genericity conditions are minimal, i.e., both necessary and sufficient. This family is large enough to approximate many commonly encountered nonlinearities to within arbitrary precision in the uniform norm.
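
As an illustration of what "up to certain symmetries induced by the nonlinearity" can mean, the sketch below uses tanh, which is an odd function, on a small single-hidden-layer network. It is not the construction from the paper: the layer sizes, variable names, and random weights are assumptions made only for this example. Permuting the hidden neurons, or flipping the sign of one neuron's incoming weights, bias, and outgoing weights, changes the parameters but leaves the input-output map unchanged, so identifiability can hold at best modulo such transformations.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    """Single-hidden-layer tanh network applied to a batch of row vectors x."""
    return np.tanh(x @ W1.T + b1) @ W2.T + b2

# Hypothetical sizes and random parameters, chosen only for illustration.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 3, 4, 2
W1 = rng.normal(size=(n_hidden, n_in))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))
b2 = rng.normal(size=n_out)

x = rng.normal(size=(5, n_in))  # batch of 5 inputs
y = forward(x, W1, b1, W2, b2)

# Symmetry 1: relabel (permute) the hidden neurons.
perm = np.array([2, 0, 3, 1])
W1_p, b1_p, W2_p = W1[perm], b1[perm], W2[:, perm]
y_perm = forward(x, W1_p, b1_p, W2_p, b2)

# Symmetry 2: tanh(-z) = -tanh(z), so negating the incoming weights and bias
# of hidden neuron 0 together with its outgoing weights cancels out.
signs = np.array([-1.0, 1.0, 1.0, 1.0])
W1_s, b1_s, W2_s = signs[:, None] * W1, signs * b1, W2 * signs[None, :]
y_flip = forward(x, W1_s, b1_s, W2_s, b2)

print(np.allclose(y, y_perm), np.allclose(y, y_flip))
```

Both comparisons check that different parameter sets realize the same map on the sampled inputs. Which symmetries arise depends on the nonlinearity; the sign flip here relies specifically on tanh being odd.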
