Lip Localization and Viseme Classification for Visual Speech Recognition
This work tackles the challenge of enhancing communication systems and accessibility for people with hearing impairments or mobility issues, though it appears incremental as it builds on existing lip-reading methods.
The paper addresses the problem of visual speech recognition by focusing on lip localization and viseme classification, aiming to improve systems for multimedia applications and assistive technologies for individuals with special needs.
The need for automatic lip-reading systems is ever increasing. In fact, the extraction and reliable analysis of facial movements now form an important part of many multimedia systems, such as videoconferencing, low communication systems, and lip-reading systems. In addition, visual information is essential for people with special needs. We can imagine, for example, a dependent person commanding a machine with a simple lip movement or by pronouncing a single syllable. Moreover, people with hearing problems compensate for their special needs by lip-reading as well as listening to the person with whom they are talking.