Neuro-SERKET: Development of Integrative Cognitive System through the Composition of Deep Probabilistic Generative Models
This work addresses the challenge of building complex AI systems by integrating diverse models, though as an extension of the existing SERKET framework its contribution appears incremental.
The paper tackles the problem of developing integrative cognitive systems by proposing Neuro-SERKET, a framework that composes deep probabilistic generative models to enable unsupervised learning across modules, and demonstrates its validity through a multimodal categorization task with image and speech data.
This paper describes Neuro-SERKET, a framework for developing integrative cognitive systems based on probabilistic generative models (PGMs). Neuro-SERKET is an extension of SERKET, which composes elemental PGMs developed in a distributed manner and provides a scheme that allows the composed PGMs to learn throughout the system in an unsupervised way. In addition to the head-to-tail connection supported by SERKET, Neuro-SERKET supports tail-to-tail and head-to-head connections, as well as neural network-based modules, i.e., deep generative models. As an example of a Neuro-SERKET application, an integrative model was developed by composing a variational autoencoder (VAE), a Gaussian mixture model (GMM), latent Dirichlet allocation (LDA), and automatic speech recognition (ASR). The model is called VAE+GMM+LDA+ASR. The performance of VAE+GMM+LDA+ASR and the validity of Neuro-SERKET were demonstrated through a multimodal categorization task using image data and a speech signal of numerical digits.
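
The abstract does not define head-to-tail, tail-to-tail, and head-to-head connections, so the sketch below takes them in their standard directed-graphical-model sense. It is a minimal illustration, not the Neuro-SERKET implementation: it builds toy joint distributions over three binary variables a, b, c with NumPy and checks the independence pattern each connection implies. All probability tables, variable names, and helper functions here are assumptions chosen for illustration.

```python
import numpy as np

def independent(joint_ab):
    """True if a 2-D table p(a, b) equals the product of its marginals."""
    pa = joint_ab.sum(axis=1, keepdims=True)
    pb = joint_ab.sum(axis=0, keepdims=True)
    return bool(np.allclose(joint_ab, pa * pb))

def cond_independent_given_c(joint_abc):
    """True if a and b are independent for every value of c (axes: a, b, c)."""
    for c in range(joint_abc.shape[2]):
        slice_ab = joint_abc[:, :, c]
        if not independent(slice_ab / slice_ab.sum()):
            return False
    return True

def connection_report():
    """Return {name: (a and b independent, a and b independent given c)}."""
    # Hypothetical toy tables; each conditional table has rows summing to 1.
    p_a = np.array([0.6, 0.4])
    p_b = np.array([0.3, 0.7])
    p_c = np.array([0.5, 0.5])
    t1 = np.array([[0.9, 0.1], [0.2, 0.8]])  # indexed [parent, child]
    t2 = np.array([[0.7, 0.3], [0.1, 0.9]])  # indexed [parent, child]
    # p(c | a, b), indexed [a, b, c]: c is likely 0 when a == b.
    p_c_ab = np.array([[[0.9, 0.1], [0.1, 0.9]],
                       [[0.1, 0.9], [0.9, 0.1]]])

    joints = {
        # a -> c -> b : p(a) p(c|a) p(b|c)
        "head_to_tail": np.einsum("a,ac,cb->abc", p_a, t1, t2),
        # a <- c -> b : p(c) p(a|c) p(b|c)
        "tail_to_tail": np.einsum("c,ca,cb->abc", p_c, t1, t2),
        # a -> c <- b : p(a) p(b) p(c|a,b)
        "head_to_head": np.einsum("a,b,abc->abc", p_a, p_b, p_c_ab),
    }
    return {
        name: (independent(j.sum(axis=2)), cond_independent_given_c(j))
        for name, j in joints.items()
    }

if __name__ == "__main__":
    for name, (marginal, conditional) in connection_report().items():
        print(f"{name}: independent={marginal}, independent given c={conditional}")
```

In the head-to-tail and tail-to-tail cases, observing the shared variable c decouples a and b. In the head-to-head case the pattern reverses: a and b are independent a priori and become coupled once c is observed.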