SD AI CL AS Sep 14, 2023

CiwaGAN: Articulatory information exchange

Gašper Beguš, Thomas Lu, Alan Zhou, Peter Wu, Gopala K. Anumanchipalli

arXiv:2309.07861v15.82 citations h-index: 25 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for cognitively plausible simulations of human speech acquisition, though it appears incremental by combining existing components.

The paper tackles the problem of modeling human spoken language acquisition by introducing CiwaGAN, which combines unsupervised articulatory modeling with unsupervised information exchange through the auditory modality. The authors describe it as the most realistic deep learning approximation of spoken language acquisition to date.

Humans encode information into sounds by controlling articulators and decode information from sounds using the auditory apparatus. This paper introduces CiwaGAN, a model of human spoken language acquisition that combines unsupervised articulatory modeling with an unsupervised model of information exchange through the auditory modality. While prior research includes unsupervised articulatory modeling and information exchange separately, our model is the first to combine the two components. The paper also proposes an improved articulatory model with more interpretable internal representations. The proposed CiwaGAN model is the most realistic approximation of human spoken language acquisition using deep learning. As such, it is useful for cognitively plausible simulations of the human speech act.
