CV, LG, IV | Sep 28, 2020

A Study on Lip Localization Techniques used for Lip reading from a Video

arXiv:2009.13420v1 | 5 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses lip reading for automatic speech recognition in noisy or audio-absent communication systems, but it appears incremental as it builds on existing techniques.

The paper reviews and compares various lip localization techniques for lip reading from video, proposing a new approach based on the discussed methods to handle asymmetric lips and mouths with visible teeth, tongue, or moustache.

This paper discusses and compares several techniques for localizing the lips within the face, along with their processing steps. Lip localization is the basic step needed to read lips and extract visual information from video input. The techniques can be applied to asymmetric lips and to mouths with visible teeth, a visible tongue, or a moustache. Lip reading generally proceeds in three steps: locating the lips in the first frame of the video, tracking the lips through the following frames using the pixel points found in the first step, and finally converting the tracked lip model to its matching letter to yield the visual information. A new approach is also proposed on the basis of the discussed techniques. Lip reading is useful in automatic speech recognition when the audio in a communication system is absent or weak, with or without noise. Human-computer communication will also require speech recognition.
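The first two steps of that pipeline (locate the lips in the first frame, then track them in later frames from the previous result) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the pseudo-hue cue R / (R + G), the 0.65 threshold, the toy skin and lip colours, the synthetic frames, and the function names `locate_lips` and `track_lips`. The final step, mapping the tracked lip shape to a letter, would need a trained classifier and is left out.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]      # (R, G, B), each 0..255
Frame = List[List[Pixel]]         # frame[row][col]
Box = Tuple[int, int, int, int]   # (top, left, bottom, right), inclusive

# Toy colours chosen for this sketch only (assumptions, not measured data).
SKIN: Pixel = (200, 160, 140)
LIP: Pixel = (180, 60, 70)


def pseudo_hue(pixel: Pixel) -> float:
    """R / (R + G); assumed here to be higher for lips than for skin."""
    r, g, _ = pixel
    return r / (r + g) if r + g else 0.0


def locate_lips(frame: Frame, region: Optional[Box] = None,
                threshold: float = 0.65) -> Optional[Box]:
    """Bounding box of pixels in `region` whose pseudo-hue exceeds `threshold`."""
    rows, cols = len(frame), len(frame[0])
    top, left, bottom, right = region or (0, 0, rows - 1, cols - 1)
    hits = [(y, x)
            for y in range(max(top, 0), min(bottom, rows - 1) + 1)
            for x in range(max(left, 0), min(right, cols - 1) + 1)
            if pseudo_hue(frame[y][x]) > threshold]
    if not hits:
        return None
    ys = [y for y, _ in hits]
    xs = [x for _, x in hits]
    return (min(ys), min(xs), max(ys), max(xs))


def track_lips(frames: List[Frame], margin: int = 2) -> List[Optional[Box]]:
    """Search the whole first frame, then only near the previous box."""
    boxes: List[Optional[Box]] = []
    previous: Optional[Box] = None
    for frame in frames:
        region = None
        if previous is not None:
            t, l, b, r = previous
            region = (t - margin, l - margin, b + margin, r + margin)
        box = locate_lips(frame, region)
        if box is None and region is not None:
            box = locate_lips(frame)  # lost track: fall back to a full search
        boxes.append(box)
        previous = box
    return boxes


def make_frame(box: Box, rows: int = 12, cols: int = 16) -> Frame:
    """Synthetic frame: skin colour everywhere, lip colour inside `box`."""
    t, l, b, r = box
    return [[LIP if t <= y <= b and l <= x <= r else SKIN
             for x in range(cols)] for y in range(rows)]


# Two synthetic frames: a closed mouth, then a slightly opened, shifted one.
frames = [make_frame((5, 4, 6, 11)), make_frame((5, 5, 8, 11))]
boxes = track_lips(frames)
```

Here `boxes` holds one bounding box per frame. Restricting the search to a margin around the previous box is one simple way to reuse the pixel points from the initial step; real systems would typically track a lip contour or model rather than a rectangle.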

