Do Neural Network Weights account for Classes Centers?
This work addresses a foundational problem in deep learning that is relevant to both researchers and practitioners: it corrects a common empirical assumption that affects training stability, although the contribution is incremental in nature.
The paper examines the assumption that neural network weight vectors correspond to class centers in feature space, showing that it does not always hold and that its violation leads to training instability. It proposes a specific symmetry condition under which the assumption is satisfied and the convergence issues are resolved, as demonstrated empirically.
The use of Deep Neural Networks (DNNs) as descriptors in feature learning tasks has enjoyed considerable popularity over the past few years. This trend focuses on the development of effective loss functions that ensure both high feature discrimination among different classes and low geodesic distance between the feature vectors of a given class. The vast majority of contemporary works base their formulation on an empirical assumption about the feature space of a network's last hidden layer, claiming that the weight vector of a class accounts for its geometrical center in the studied space. The paper at hand follows a theoretical approach and indicates that the aforementioned hypothesis is not always met. This fact raises stability issues regarding the training procedure of a DNN, as shown in our experimental study. Consequently, a specific symmetry that satisfies the above assumption is proposed and studied both analytically and empirically, addressing the established convergence issues.
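As an illustration of the assumption discussed above, the following minimal sketch measures how closely each class weight vector of a final linear layer points toward the empirical center of that class's features. It is not the paper's method or its proposed symmetry. The function names (`class_centers`, `weight_center_cosine`), the use of L2-normalised features, cosine similarity as the alignment measure, and the toy data are all assumptions made for this example. Every class is assumed to have at least one sample and no zero-length vectors.

```python
import numpy as np

def class_centers(features, labels, num_classes):
    """Per-class mean of L2-normalised features, rescaled to unit length."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    centers = np.stack([unit[labels == c].mean(axis=0) for c in range(num_classes)])
    return centers / np.linalg.norm(centers, axis=1, keepdims=True)

def weight_center_cosine(weights, features, labels):
    """Cosine similarity between each class weight vector and its class center.

    weights:  (num_classes, dim) rows of the last layer's weight matrix
    features: (num_samples, dim) last-hidden-layer feature vectors
    labels:   (num_samples,) integer class labels in [0, num_classes)
    """
    centers = class_centers(features, labels, weights.shape[0])
    unit_w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    return np.sum(unit_w * centers, axis=1)

# Hypothetical toy data: two classes clustered around the two coordinate axes.
features = np.array([[1.0, 0.1], [1.0, -0.1], [0.1, 1.0], [-0.1, 1.0]])
labels = np.array([0, 0, 1, 1])

# Weights that point at the class centers (the assumption holds).
aligned = np.array([[2.0, 0.0], [0.0, 3.0]])
# Weights where class 0 is rotated 45 degrees away from its center.
skewed = np.array([[1.0, 1.0], [0.0, 1.0]])

cos_aligned = weight_center_cosine(aligned, features, labels)
cos_skewed = weight_center_cosine(skewed, features, labels)
print(cos_aligned, cos_skewed)
```

A cosine close to 1 for every class is consistent with the weight vector acting as the class center. Values noticeably below 1, as for class 0 in the `skewed` case, indicate that the assumption does not hold for that class under this measure.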