GR, CL, HC, SD, AS | Sep 20, 2024

Sketching With Your Voice: "Non-Phonorealistic" Rendering of Sounds via Vocal Imitation

arXiv:2409.13507v1 | 1 citation | h-index: 31
Originality: Incremental advance
AI Analysis

This work addresses the challenge of auditory representation for applications in computer graphics and human-computer interaction. Its advance is incremental, building on existing vocal modeling and communication theories.

The paper tackles the problem of automatically generating human-like vocal imitations of sounds. Its method combines a vocal tract model with cognitive theories of communication, and it aligns better with human intuitions than feature matching alone.

We present a method for automatically producing human-like vocal imitations of sounds: the equivalent of "sketching," but for auditory rather than visual representation. Starting with a simulated model of the human vocal tract, we first try generating vocal imitations by tuning the model's control parameters to make the synthesized vocalization match the target sound in terms of perceptually-salient auditory features. Then, to better match human intuitions, we apply a cognitive theory of communication to take into account how human speakers reason strategically about their listeners. Finally, we show through several experiments and user studies that when we add this type of communicative reasoning to our method, it aligns with human intuitions better than matching auditory features alone does. This observation has broad implications for the study of depiction in computer graphics.
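The contrast the abstract draws between pure feature matching and communicative reasoning can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the two-dimensional feature vectors, the sound names, the candidate vocalizations, and the softmax "listener" are all hypothetical, and no vocal tract is simulated. The idea shown is only that a speaker who reasons about a listener may prefer an imitation that is farther from the target in feature space if it is harder to confuse with other sounds.

```python
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def feature_matching_choice(target, candidates):
    """Pick the candidate vocalization closest to the target in feature space."""
    return min(candidates, key=lambda name: distance(candidates[name], target))

def listener_probs(utterance, sounds, temperature=1.0):
    """Hypothetical listener: softmax over negative distances to each known sound."""
    scores = {name: math.exp(-distance(utterance, feats) / temperature)
              for name, feats in sounds.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

def communicative_choice(target_name, sounds, candidates):
    """Pick the candidate that makes the listener most likely to recover the target."""
    return max(
        candidates,
        key=lambda name: listener_probs(candidates[name], sounds)[target_name],
    )

# Toy data (assumed): two sounds that differ mainly in the second feature.
sounds = {"siren": (1.0, 0.0), "whistle": (1.0, 0.6)}

# Two candidate vocalizations (assumed): one close to the siren but also near
# the whistle, one that exaggerates the feature that separates the two sounds.
candidates = {"close": (1.0, 0.2), "exaggerated": (0.9, -0.4)}

matched = feature_matching_choice(sounds["siren"], candidates)
communicated = communicative_choice("siren", sounds, candidates)
```

In this toy setup, `matched` is the nearest candidate, while `communicated` is the exaggerated one, because the exaggerated imitation is less likely to be mistaken for the whistle. How the actual method scores vocalizations and models listeners is described in the paper, not here.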

Code Implementations: 1 repo