Progress in animation of an EMA-controlled tongue model for acoustic-visual speech synthesis
This work addresses realistic tongue animation in talking heads for speech synthesis. The contribution appears incremental, since it combines an existing method (skeletal animation) with new data sources (EMA and MRI).
The paper tackles the animation of a 3D kinematic tongue model for acoustic-visual speech synthesis. It adapts skeletal animation to a deformable rig controlled by EMA motion capture data, extracts the tongue surface from MRI data, and presents initial results.
We present a technique for the animation of a 3D kinematic tongue model, one component of the talking head of an acoustic-visual (AV) speech synthesizer. The skeletal animation approach is adapted to make use of a deformable rig controlled by tongue motion capture data obtained with electromagnetic articulography (EMA), while the tongue surface is extracted from volumetric magnetic resonance imaging (MRI) data. Initial results are shown and future work outlined.