Artimate: an articulatory animation framework for audiovisual speech synthesis
This work provides a tool for creating more realistic virtual characters in applications such as gaming and audiovisual speech synthesis. Its contribution is incremental, however, since it adapts existing skeletal animation methods to a specific domain.
The authors address the problem of generating realistic articulatory animation for audiovisual speech synthesis. They develop a modular framework that uses speech motion capture data to animate a 3D vocal tract model, producing a portable resource that can be integrated into platforms such as game engines.
We present a modular framework for articulatory animation synthesis using speech motion capture data obtained with electromagnetic articulography (EMA). Adapting a skeletal animation approach, we apply the articulatory motion data to a three-dimensional (3D) model of the vocal tract, creating a portable resource that can be integrated into an audiovisual (AV) speech synthesis platform to provide realistic animation of the tongue and teeth for a virtual character. The framework also provides an interface to articulatory animation synthesis, as well as an example application that illustrates its use with a 3D game engine. We rely on cross-platform, open-source software and open standards to provide a lightweight, accessible, and portable workflow.
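To make the idea of driving a skeleton with motion capture data concrete, the following is a minimal sketch, not the framework's actual implementation. It assumes, purely for illustration, that each tracked EMA coil acts as an aiming target for one bone with a fixed head position, and it computes the axis-angle rotation that turns the bone from its rest direction toward the coil in each frame. The function names `aim_rotation` and `bone_keyframes` and the data layout are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(_dot(v, v))
    if n == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / n for x in v)

def aim_rotation(head, rest_tail, target):
    """Return (axis, angle) rotating the rest bone direction toward target.

    Hypothetical helper: the bone head is assumed fixed, and the target is
    an EMA coil position expressed in the same coordinate frame.
    """
    u = _unit(_sub(rest_tail, head))   # rest direction of the bone
    v = _unit(_sub(target, head))      # desired direction toward the coil
    axis = _cross(u, v)
    s = math.sqrt(_dot(axis, axis))    # sine of the angle
    c = _dot(u, v)                     # cosine of the angle
    if s < 1e-12:
        if c > 0.0:
            # Already aligned: no rotation, axis is arbitrary.
            return (0.0, 0.0, 1.0), 0.0
        # Opposite directions: rotate by pi about any axis perpendicular to u.
        i = min(range(3), key=lambda k: abs(u[k]))
        e = tuple(1.0 if k == i else 0.0 for k in range(3))
        return _unit(_cross(u, e)), math.pi
    return tuple(x / s for x in axis), math.atan2(s, c)

def bone_keyframes(head, rest_tail, coil_track):
    """One (axis, angle) keyframe per motion capture frame."""
    return [aim_rotation(head, rest_tail, p) for p in coil_track]

if __name__ == "__main__":
    # Made-up coil trajectory for a single bone, two frames.
    track = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    for axis, angle in bone_keyframes((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), track):
        print(axis, round(math.degrees(angle), 3))
```

In a skeletal animation workflow, keyframes of this kind would typically be stored on the armature and deform the tongue mesh through vertex weights. A chain of several bones would call for an inverse kinematics solver rather than this single-bone aiming step.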