On the Architectural Complexity of Neural Networks

Nicholas J. Cooper, François G. Meyer, Michael L. Roberts, Carlos Zapata-Carratalá, Lijun Chen, Danna Gurari

arXiv:2605.04325 · 55.1 · h-index: 23 · Has Code

Predicted impact: top 43% in LG · last 90 days · Originality: Incremental advance

AI Analysis

For machine learning researchers, this framework provides a new lens for understanding and generating novel neural network architectures, though its practical impact remains to be demonstrated.

The paper introduces a unified theoretical framework for analyzing and constructing deep neural networks by modeling tensor operations, revealing that groundbreaking architectures correlate with increases in architectural complexity. It identifies unexplored high-complexity architectures and releases a dataset of over 3,000 such architectures.

We introduce a unified theoretical framework for the rigorous analysis and systematic construction of deep neural networks (DNNs). This framework addresses a gap in existing theory by explicitly modeling the structure of tensor operations, lower-level information that is often abstracted away. Our framework enables two novel objectives: (1) analysis of the evolution of architectural complexity over deep learning history, and (2) automatic construction of novel architectures based on new types of tensor operations. Our study of DNNs introduced over the past 40 years reveals a connection between groundbreaking architectures and increases in different types of architectural complexity. Moreover, we identify several large classes of higher-complexity architectures that have not yet been explored. We then collect a dataset of 3,000+ higher-complexity architectures, which we publicly release at: https://github.com/combinatoriallabs/ArchitecturalComplexity.
