Understanding the visual speech signal
This work aims to improve machine lipreading, with applications in speech therapy, animation, and psychology, but the contribution appears incremental.
The paper investigates the visual speech signal to understand visemes, demonstrating how they can be used to boost lipreading performance.
To lipread, that is, to understand speech from lip movement, machines decode lip motions (known as visemes) into spoken sounds. We investigate the visual speech channel to further our understanding of visemes. This has applications beyond machine lipreading: speech therapists, animators, and psychologists can also benefit from this work. We explain the influence of speaker individuality and demonstrate how visemes can be used to boost lipreading.