CV · AS · Oct 3, 2017

Understanding the visual speech signal

arXiv:1710.01351v1
AI Analysis

This work addresses machine lipreading, with relevance to speech therapy, animation, and psychology, but appears incremental.

The paper investigates the visual speech signal to understand visemes, demonstrating how they can be used to boost lipreading performance.

For machines to lipread, or understand speech from lip movement, they decode lip-motions (known as visemes) into the spoken sounds. We investigate the visual speech channel to further our understanding of visemes. This has applications beyond machine lipreading; speech therapists, animators, and psychologists can benefit from this work. We explain the influence of speaker individuality, and demonstrate how one can use visemes to boost lipreading.
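To make the decoding problem concrete, here is a minimal sketch of why mapping visemes back to spoken sounds is ambiguous: several phonemes look alike on the lips (for example /p/, /b/ and /m/), so distinct words can share one viseme sequence. The viseme class names, the phoneme-to-viseme table, and the mini-lexicon below are assumptions made for illustration, not the mapping or data used in the paper.

```python
from collections import defaultdict

# Illustrative phoneme-to-viseme classes (an assumption for this sketch,
# not the mapping used in the paper).
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "t": "V_alveolar", "d": "V_alveolar",
    "ae": "V_open",
}

# Hypothetical mini-lexicon: word -> phoneme sequence.
LEXICON = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "mat": ["m", "ae", "t"],
    "fat": ["f", "ae", "t"],
    "vat": ["v", "ae", "t"],
}


def to_visemes(phonemes):
    """Map a phoneme sequence to its viseme sequence."""
    return tuple(PHONEME_TO_VISEME[p] for p in phonemes)


def group_by_visemes(lexicon):
    """Group words that are indistinguishable at the viseme level."""
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        groups[to_visemes(phonemes)].append(word)
    return dict(groups)


groups = group_by_visemes(LEXICON)
for visemes, words in groups.items():
    print(visemes, "->", words)
```

Under this toy table, "pat", "bat" and "mat" collapse into one viseme sequence, while "fat" and "vat" collapse into another. A lipreading system therefore needs extra information, such as language context, to choose among the candidates.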
