LG | AI | Oct 30, 2024

Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective

arXiv:2410.23391v2 | 4 citations | h-index: 2 | NIPS
AI Analysis

The paper gives researchers theoretical insight into implicit neural networks, though the contribution is incremental: it extends an existing analysis (Neural Collapse) to a new model type.

The paper tackles the limited theoretical analysis of Deep Equilibrium Models (DEQ) by using Neural Collapse to analyze their representation, showing that DEQ exhibits Neural Collapse under balanced conditions and has advantages like feature convergence to simplex vertices and self-duality in imbalanced settings.

The Deep Equilibrium Model (DEQ), a typical implicit neural network, is known for its memory efficiency and competitive performance compared with explicit neural networks. However, there has been relatively limited theoretical analysis of the representation of DEQ. In this paper, we use Neural Collapse ($\mathcal{NC}$) as a tool to systematically analyze the representation of DEQ under both balanced and imbalanced conditions. $\mathcal{NC}$ is a phenomenon in neural network training that characterizes the geometry of class features and classifier weights. While it has been studied extensively in traditional explicit neural networks, the $\mathcal{NC}$ phenomenon has received little attention in the context of implicit neural networks. We theoretically show that $\mathcal{NC}$ exists in DEQ under balanced conditions. Moreover, in imbalanced settings, despite the presence of minority collapse, DEQ demonstrates advantages over explicit neural networks. These advantages include the convergence of extracted features to the vertices of a simplex equiangular tight frame and self-duality properties under mild conditions, highlighting DEQ's superiority in handling imbalanced datasets. Finally, we validate our theoretical analyses through experiments in both balanced and imbalanced scenarios.
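The "simplex equiangular tight frame" mentioned in the abstract may be easier to picture with a small numerical sketch. The snippet below is not taken from the paper. It uses the standard construction of a simplex ETF for K classes (centre the standard basis, then rescale to unit norm) and checks that every pair of distinct vertices has cosine similarity -1/(K-1). The function names `simplex_etf` and `pairwise_cosines`, and the choice of K = 4, are illustrative assumptions.

```python
import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a (K, K) matrix whose rows are the K vertices of a simplex ETF.

    Standard textbook construction, not the paper's code: centre the
    standard basis vectors and rescale so each row has unit norm.
    """
    k = num_classes
    centred = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * centred


def pairwise_cosines(vectors: np.ndarray) -> np.ndarray:
    """Cosine similarity between every pair of rows."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


# Hypothetical example with K = 4 classes.
etf = simplex_etf(4)
cosines = pairwise_cosines(etf)

# Off-diagonal entries are all equal (the "equiangular" property).
off_diagonal = cosines[~np.eye(4, dtype=bool)]
print(np.round(off_diagonal, 4))
```

Under Neural Collapse in the balanced setting, the class-mean features (after centring) and the classifier weights are expected to align with such a frame. Comparing the measured pairwise cosines against -1/(K-1) is one common way to quantify how close a trained model is to that geometry.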
