LG · AI · AG · ML
Jun 20, 2025

Identifiability of Deep Polynomial Neural Networks

Konstantin Usevich, Ricardo Borsoi, Clara Dérand, Marianne Clausel

arXiv:2506.17093v2 · 13 citations · h-index: 3

Originality: Incremental advance

AI Analysis

This work addresses an interpretability challenge for researchers and practitioners who use PNNs. It is an incremental advance, building on existing algebraic and geometric structures.

The paper tackles the problem of identifiability in deep Polynomial Neural Networks (PNNs), which is crucial for interpretability, by analyzing how activation degrees and layer widths affect this property. It shows that architectures with non-increasing layer widths are generically identifiable under mild conditions, and encoder-decoder networks are identifiable when decoder widths do not grow too rapidly compared to activation degrees.

Polynomial Neural Networks (PNNs) possess a rich algebraic and geometric structure. However, their identifiability -- a key property for ensuring interpretability -- remains poorly understood. In this work, we present a comprehensive analysis of the identifiability of deep PNNs, including architectures with and without bias terms. Our results reveal an intricate interplay between activation degrees and layer widths in achieving identifiability. As special cases, we show that architectures with non-increasing layer widths are generically identifiable under mild conditions, while encoder-decoder networks are identifiable when the decoder widths do not grow too rapidly compared to the activation degrees. Our proofs are constructive and center on a connection between deep PNNs and low-rank tensor decompositions, together with Kruskal-type uniqueness theorems. We also settle an open conjecture on the dimension of PNNs' neurovarieties, and provide new bounds on the activation degrees required for them to reach the expected dimension.
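
To make the notion of identifiability concrete, here is a minimal sketch in plain Python. It assumes the usual setup, which the abstract does not spell out: a bias-free PNN applies a monomial activation z -> z**degree after each hidden layer, and parameters can only ever be recovered up to permuting hidden neurons and rescaling them. The function names (`pnn_forward`, `rescale_and_permute`) and the layer sizes are hypothetical choices for illustration, not taken from the paper.

```python
import math
import random


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def pnn_forward(weights, x, degree):
    """Bias-free PNN: monomial activation after every layer except the last."""
    h = x
    for W in weights[:-1]:
        h = [z ** degree for z in matvec(W, h)]
    return matvec(weights[-1], h)


def rescale_and_permute(W1, W2, scales, perm, degree):
    """Apply the trivial symmetries of a two-layer PNN (illustrative helper).

    Hidden neuron i has its incoming weights multiplied by scales[i] and its
    outgoing weights multiplied by scales[i] ** (-degree); the hidden neurons
    are then reordered according to perm.
    """
    W1_new = [[scales[i] * w for w in W1[i]] for i in perm]
    W2_new = [[row[i] * scales[i] ** (-degree) for i in perm] for row in W2]
    return W1_new, W2_new


random.seed(0)
degree = 3  # activation degree (assumed value)
W1 = [[random.gauss(0, 1) for _ in range(3)] for _ in range(4)]  # 3 inputs, 4 hidden
W2 = [[random.gauss(0, 1) for _ in range(4)] for _ in range(2)]  # 4 hidden, 2 outputs

scales = [2.0, -0.5, 1.5, 3.0]  # any nonzero scalings
perm = [2, 0, 3, 1]             # any permutation of the hidden neurons
W1_new, W2_new = rescale_and_permute(W1, W2, scales, perm, degree)

x = [random.gauss(0, 1) for _ in range(3)]
y_old = pnn_forward([W1, W2], x, degree)
y_new = pnn_forward([W1_new, W2_new], x, degree)

# Different parameters, same function value at x.
same_output = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9) for a, b in zip(y_old, y_new)
)
print(same_output)
```

The two parameter sets differ but define the same polynomial map, since (c z)**degree times c**(-degree) equals z**degree. Under this reading, an architecture is identifiable when such permutations and rescalings are the only ambiguity for generic weights.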
