Linguists should learn to love speech-based deep learning models
This is an incremental proposal for linguists to expand their methods by incorporating speech-based models.
The authors argue that the focus on text-based large language models limits interactions with linguistics, and propose that audio-based deep learning models should play a crucial role in addressing questions that lie beyond written text.
Futrell and Mahowald present a useful framework bridging technology-oriented deep learning systems and explanation-oriented linguistic theories. Unfortunately, the target article's focus on generative text-based LLMs fundamentally limits fruitful interactions with linguistics, as many interesting questions about human language fall outside what is captured by written text. We argue that audio-based deep learning models can and should play a crucial role.