CV LG Jan 31, 2024

Multilinear Operator Networks

Yixin Cheng, Grigorios G. Chrysos, Markos Georgopoulos, Volkan Cevher

arXiv:2401.17992v114.112 citations · h-index: 61 · Has Code · ICLR

Originality: Incremental advance

AI Analysis

This work addresses the reliance of neural networks on activation functions. It offers machine learning researchers an activation-free alternative, though the advance appears incremental because it builds on polynomial networks.

The authors tackle the problem of eliminating activation functions in deep neural networks by proposing MONet, a model based solely on multilinear operators. On image recognition and scientific computing benchmarks, MONet outperforms prior polynomial networks and performs on par with modern architectures.

Despite the remarkable capabilities of deep neural networks in image recognition, the dependence on activation functions remains a largely unexplored area and has yet to be eliminated. On the other hand, Polynomial Networks are a class of models that do not require activation functions, but they have yet to perform on par with modern architectures. In this work, we aim to close this gap and propose MONet, which relies solely on multilinear operators. The core layer of MONet, called Mu-Layer, captures multiplicative interactions of the elements of the input token. MONet captures high-degree interactions of the input elements, and we demonstrate the efficacy of our approach on a series of image recognition and scientific computing benchmarks. The proposed model outperforms prior polynomial networks and performs on par with modern architectures. We believe that MONet can inspire further research on models that use entirely multilinear operations.
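To make the idea of multiplicative interactions without activation functions concrete, here is a minimal sketch in plain Python. It is not the paper's Mu-Layer: the form `C((A x) * (B x) + A x)`, the weight names `A`, `B`, `C`, and the sizes are all assumptions chosen for illustration. The only nonlinearity is an elementwise product of two linear projections of the same input token.

```python
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def random_matrix(rows, cols, rng):
    """Hypothetical helper: dense matrix with entries in [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def multiplicative_layer(x, A, B, C):
    """Illustrative activation-free layer (an assumption, not the paper's
    exact definition): out = C((A x) * (B x) + A x)."""
    a = matvec(A, x)  # first linear projection of the token
    b = matvec(B, x)  # second linear projection of the token
    # Elementwise product gives degree-2 terms; adding `a` keeps degree-1 terms.
    hidden = [ai * bi + ai for ai, bi in zip(a, b)]
    return matvec(C, hidden)  # project back to the token dimension

rng = random.Random(0)
d, h = 4, 6  # token dimension and hidden dimension (arbitrary choices)
A = random_matrix(h, d, rng)
B = random_matrix(h, d, rng)
C = random_matrix(d, h, rng)

x = [0.5, -1.0, 2.0, 0.25]
y = multiplicative_layer(x, A, B, C)
```

Each output element of this sketch is a degree-2 polynomial of the input elements, so stacking N such layers can represent interactions of degree up to 2^N with no activation function anywhere in the stack.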

