LG CV

Feb 15, 2023

Self-Organising Neural Discrete Representation Learning à la Kohonen

Kazuki Irie, Róbert Csordás, Jürgen Schmidhuber

arXiv:2302.07950v23.82 citations

h-index: 100

Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the robustness and efficiency of discrete representation learning in generative models. It is incremental in that it revisits an older algorithm, Kohonen's Self-Organising Map (KSOM), in a modern context.

The paper tackles unsupervised learning of discrete representations in neural networks by studying KSOM as an alternative to the EMA-VQ algorithm commonly used in VQ-VAEs for image processing. It finds that KSOM is more robust to the choice of initialisation scheme, while its speed-up over well-configured EMA-VQ is observable only at the beginning of training.

Unsupervised learning of discrete representations in neural networks (NNs) from continuous ones is essential for many modern applications. Vector Quantisation (VQ) has become popular for this, in particular in the context of generative models, such as Variational Auto-Encoders (VAEs), where the exponential moving average-based VQ (EMA-VQ) algorithm is often used. Here, we study an alternative VQ algorithm based on Kohonen's learning rule for the Self-Organising Map (KSOM; 1982). EMA-VQ is a special case of KSOM. KSOM is known to offer two potential benefits: empirically, it converges faster than EMA-VQ, and KSOM-generated discrete representations form a topological structure on the grid whose nodes are the discrete symbols, resulting in an artificial version of the brain's topographic map. We revisit these properties by using KSOM in VQ-VAEs for image processing. In our experiments, the speed-up compared to well-configured EMA-VQ is only observable at the beginning of training, but KSOM is generally much more robust, e.g., w.r.t. the choice of initialisation schemes.
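
The relationship between the two update rules mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is the textbook online form of Kohonen's rule, not the authors' implementation: the function name `ksom_update`, the parameters `lr` and `sigma`, and the Gaussian neighbourhood are all assumptions chosen for illustration. Each code vector moves towards the input in proportion to how close its grid node is to the winning node. With `sigma=0` the neighbourhood collapses to the winner alone, so the update becomes an exponential moving average of that single code vector, which is one way to read the statement that EMA-VQ is a special case.

```python
import numpy as np


def ksom_update(codebook, x, grid, lr=0.1, sigma=1.0):
    """One online Kohonen-style update (illustrative sketch, hypothetical API).

    codebook: (K, D) array of code vectors
    x:        (D,) input vector
    grid:     (K, G) array of grid coordinates, one row per code vector
    lr:       step size (assumed name)
    sigma:    neighbourhood width on the grid; 0 means winner-only update
    """
    # Winner = code vector closest to the input in feature space.
    dists = np.sum((codebook - x) ** 2, axis=1)
    winner = int(np.argmin(dists))

    # Neighbourhood weights depend on distance to the winner on the grid,
    # not on distance in feature space.
    grid_d2 = np.sum((grid - grid[winner]) ** 2, axis=1)
    if sigma > 0:
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
    else:
        # Degenerate neighbourhood: only the winner is updated.
        h = (grid_d2 == 0).astype(codebook.dtype)

    # Move every code vector towards x, scaled by its neighbourhood weight.
    new_codebook = codebook + lr * h[:, None] * (x - codebook)
    return new_codebook, winner
```

In this sketch a larger `sigma` drags the winner's grid neighbours along with it, which is what produces the topological ordering on the grid described above; shrinking `sigma` towards zero recovers the winner-only moving-average update.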
