ML LG

Sep 6, 2020

Gradient-based Competitive Learning: Theory

Giansalvo Cirrincione, Pietro Barbiero, Gabriele Ciravegna, Vincenzo Randazzo

arXiv:2009.02799v1

1.4

Originality: Incremental advance

AI Analysis

This work introduces a theoretical framework for unsupervised learning that integrates gradient-based and competitive learning. It is rated as an incremental advance because it builds on existing techniques.

The paper combines gradient-based methods with competitive learning so that the learned representation better mimics the topology of the input manifold. It shows that a vanilla competitive layer is equivalent to its dual, which is trained on the transposed data matrix, and argues that the dual layer is better suited to very high-dimensional datasets.

Deep learning has been widely used for supervised learning, in both classification and regression problems. Recently, a new line of research has applied this paradigm to unsupervised tasks, where a gradient-based approach efficiently and autonomously extracts the features relevant for handling the input data. However, state-of-the-art techniques focus mostly on algorithmic efficiency and accuracy rather than on mimicking the input manifold. In contrast, competitive learning is a powerful tool for replicating the topology of the input distribution. This paper introduces a novel perspective in this area by combining the two techniques: unsupervised gradient-based learning and competitive learning. The theory is based on the intuition that neural networks are able to learn topological structures by working directly on the transpose of the input matrix. To this end, the vanilla competitive layer and its dual are presented. The former is an adaptation of a standard competitive layer for deep clustering, while the latter is trained on the transposed matrix. Their equivalence is proven theoretically and confirmed experimentally. However, the dual layer is better suited to handling very high-dimensional datasets. The proposed approach has great potential, as it can be generalized to a wide range of topological learning tasks, such as non-stationary and hierarchical clustering; furthermore, it can be integrated within more complex architectures such as autoencoders and generative adversarial networks.
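To make the vanilla/dual distinction concrete, here is a minimal pure-Python sketch. It is not the authors' implementation. It assumes a plain quantization-error loss (mean squared distance from each sample to its nearest prototype) and ordinary gradient descent. All function names, the toy data, and the learning rates are hypothetical choices for illustration. In the vanilla layer the prototypes are the trainable weights. In the dual layer the weights have one column per sample, and the prototypes are obtained as the layer output `W X`, which is the same as feeding the transposed data matrix through a linear layer.

```python
# Illustrative sketch only: assumed loss and update rule, not the paper's code.

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def quantization_loss(X, P):
    """Mean squared distance from each sample to its nearest prototype."""
    return sum(min(sq_dist(x, p) for p in P) for x in X) / len(X)

def grad_wrt_prototypes(X, P):
    """Gradient of the quantization loss with respect to the prototypes (k x d)."""
    n, d = len(X), len(X[0])
    G = [[0.0] * d for _ in P]
    for x in X:
        # Winner-take-all: only the nearest prototype receives gradient.
        j = min(range(len(P)), key=lambda c: sq_dist(x, P[c]))
        for t in range(d):
            G[j][t] += 2.0 * (P[j][t] - x[t]) / n
    return G

def train_vanilla(X, P, lr, steps):
    """Vanilla layer: the k x d prototype matrix is the trainable weight."""
    for _ in range(steps):
        G = grad_wrt_prototypes(X, P)
        P = [[p - lr * g for p, g in zip(pr, gr)] for pr, gr in zip(P, G)]
    return P

def train_dual(X, W, lr, steps):
    """Dual layer: weights W are k x n; prototypes are the output P = W X."""
    Xt = [list(col) for col in zip(*X)]  # transposed data, d x n
    for _ in range(steps):
        P = matmul(W, X)
        # Chain rule: dL/dW = dL/dP @ X^T, shape (k x d) @ (d x n) = k x n.
        G_W = matmul(grad_wrt_prototypes(X, P), Xt)
        W = [[w - lr * g for w, g in zip(wr, gr)] for wr, gr in zip(W, G_W)]
    return W

# Toy data (assumed): two well-separated groups of three 2-D points.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]

# Both layers start from the same prototypes: samples 0 and 3.
P0 = [X[0][:], X[3][:]]
W0 = [[1.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]]

P_vanilla = train_vanilla(X, P0, lr=0.5, steps=200)
W_dual = train_dual(X, W0, lr=0.002, steps=500)
P_dual = matmul(W_dual, X)

print("initial loss:", quantization_loss(X, P0))
print("vanilla loss:", quantization_loss(X, P_vanilla))
print("dual loss:   ", quantization_loss(X, P_dual))
```

In this sketch the dual layer holds `k x n` weights rather than `k x d`, so its parameter count depends on the number of samples and not on the feature dimension. That is one plausible reading of why a dual formulation may be attractive when the dimensionality is very large. The dual layer also needs a smaller learning rate here because its effective step is scaled by the data matrix.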
